Polymerase compositions and methods of making and using same

ABSTRACT

The present disclosure provides compositions, methods, kits, systems and apparatus that are useful for nucleic acid polymerization. In particular, modified polymerases and biologically active fragments thereof, such as modified Taq polymerases, are provided that allow for improved nucleic acid amplification. In some aspects, the disclosure provides modified polymerases having improved thermostability, accuracy, processivity and/or read length as compared to a reference Taq polymerase. In some aspects, the disclosure relates to modified polymerases or biologically active fragments thereof, useful for amplification methods, and in practically illustrative embodiments, emulsion PCR.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of U.S. application Ser. No. 15/961,206, filed Apr. 24, 2018, now allowed, which is incorporated herein by reference in its entirety and which is a Division of U.S. application Ser. No. 14/970,818, filed Dec. 16, 2015, now U.S. Pat. No. 9,976,178, issued May 22, 2018, which claims benefit of U.S. Provisional Application Ser. No. 62/092,756, filed Dec. 16, 2014, the disclosure of each of which is incorporated herein by reference in its entirety.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Dec. 11, 2015, is named LT00925_SL.txt and is 247,676 bytes in size.

FIELD OF THE INVENTION

The present invention generally relates to mutant polymerases with improved properties, for example mutant Taq polymerases, as well as nucleic acids encoding the same, and methods and kits using the same.

BACKGROUND

The ability of enzymes to catalyze biological reactions is fundamental to life. A range of biological applications use enzymes to synthesize various biomolecules in vitro. One particularly useful class of enzymes is the polymerases, which can catalyze the polymerization of biomolecules (e.g., nucleotides or amino acids) into biopolymers (e.g., nucleic acids or peptides). For example, polymerases that can polymerize nucleotides into nucleic acids, particularly in a template-dependent fashion, are useful in recombinant DNA technology and nucleic acid detection and nucleic acid sequencing applications. Many nucleic acid sequencing methods monitor nucleotide incorporations during in vitro template-dependent nucleic acid synthesis catalyzed by a polymerase. Single Molecule Sequencing (SMS) and Paired-End Sequencing (PES) typically include a polymerase for template-dependent nucleic acid synthesis. Polymerases are also useful for the generation of nucleic acid libraries, such as nucleic acid libraries created during emulsion PCR or bridge PCR. Nucleic acid libraries created using such polymerases can be used in a variety of downstream processes, such as genotyping, single nucleotide polymorphism (SNP) analysis, copy number variation analysis, epigenetic analysis, gene expression analysis, hybridization arrays, analysis of gene mutations including but not limited to detection, prognosis and/or diagnosis of disease states, detection and analysis of rare or low frequency allele mutations, and nucleic acid sequencing including but not limited to de novo sequencing or targeted resequencing.

A desirable quality of a polymerase useful for nucleic acid amplification, synthesis and/or detection is improved incorporation of nucleotides as compared to a reference polymerase. Improved nucleotide incorporation can make processes such as nucleic acid library preparation and/or DNA sequencing more cost effective by reducing the number of nucleic acid templates necessary to sequence a desired target molecule. In another aspect, improved nucleotide incorporation as compared to a reference polymerase can also reduce the number of sequencing reads required to determine the sequence of the desired target molecule.

Additionally, improved nucleotide incorporation (as compared to a reference polymerase) can also improve signal uniformity, leading to increased accuracy in base determination of the desired target molecule. In yet another aspect, improved nucleotide incorporation by a modified polymerase as compared to a reference polymerase can increase the read length of the desired target molecule and thus reduce the likelihood of the modified polymerase stalling or dissociating from the desired target molecule. In yet another aspect, a modified polymerase can have improved templating or clonal amplification efficiency as compared to a reference polymerase and thus can improve downstream sequencing of a target molecule that is customarily considered a “difficult” target molecule, such as a target molecule with high GC or AT content. As such, one aspect of the invention is to provide a method, system, apparatus, and compositions of matter that improve GC and AT bias in nucleic acid amplification using a modified polymerase having a reduced GC or AT content bias.

Another desirable quality in an enzyme used in nucleic acid library preparation or DNA sequencing is thermal stability. DNA polymerases exhibiting thermal stability have revolutionized many aspects of molecular biology and clinical diagnostics since the development of the polymerase chain reaction (PCR), which uses cycles of thermal denaturation, primer annealing, and enzymatic primer extension to amplify DNA templates. A prototype thermostable DNA polymerase used in the initial PCR experiments was Taq DNA polymerase, originally isolated from the thermophilic eubacterium Thermus aquaticus.

There are three major families of DNA polymerases, termed families A, B and C. The classification of a polymerase into one of these three families is based on structural similarity of a given polymerase to E. coli DNA polymerase I (Family A), II (Family B) or III (Family C). As examples, Family A DNA polymerases include, but are not limited to, Klenow DNA polymerase, Thermus aquaticus DNA polymerase I (Taq polymerase) and bacteriophage T7 DNA polymerase; Family B DNA polymerases, formerly known as α-family polymerases (Braithwaite and Ito, 1991, Nuc. Acids Res. 19:405), include, but are not limited to, human α, δ and ε DNA polymerases, T4, RB69 and φ29 bacteriophage DNA polymerases, and Pyrococcus furiosus DNA polymerase (Pfu polymerase); and Family C DNA polymerases include, but are not limited to, Bacillus subtilis DNA polymerase III and E. coli DNA polymerase III α and ε subunits (listed as products of the dnaE and dnaQ genes, respectively, by Braithwaite and Ito, 1993, Nucleic Acids Res. 21: 787). An alignment of DNA polymerase protein sequences of each family across a broad spectrum of archaeal, bacterial, viral and eukaryotic organisms is presented in Braithwaite and Ito (1993, supra), which is incorporated herein by reference in its entirety.

When performing polymerase-dependent nucleic acid synthesis or amplification, it can be useful to modify the polymerase (for example via mutation or chemical modification) so as to alter its catalytic properties. In some instances, it can be useful to modify the polymerase to enhance its catalytic properties. In some embodiments, it can be useful to enhance a polymerase's catalytic properties via site-directed amino acid substitution or deletion. In some embodiments, it can be useful to enhance a polymerase's catalytic properties via site-saturation mutagenesis of one, a plurality, or each, amino acid of the polymerase. In some embodiments, modification of a polymerase may be performed to enhance catalytic properties of the modified polymerase such as read length, accuracy, and/or processivity.

Polymerase performance in various biological assays involving nucleic acid synthesis or detection can be limited by the behavior of the polymerase towards nucleotide substrates, salt concentrations, or thermostable conditions. For example, analysis of polymerase activity can be complicated by undesirable behavior such as the tendency of a given polymerase to dissociate from the template; to bind and/or incorporate the incorrect, e.g., non Watson-Crick base-paired, nucleotide; or to release the correct, e.g., Watson-Crick base-paired, nucleotide without incorporation. Additionally, analysis of polymerase activity can be complicated by undesirable behavior of a target molecule, such as failure to fully denature in high AT or GC rich regions, or premature attenuation of the target molecule. As demonstrated herein, desirable polymerase properties for improved nucleic acid amplification can be achieved via suitable selection, engineering and/or modification of a polymerase of choice. For example, such modification can be performed to favorably alter the polymerase's affinity of binding to template, processivity, accuracy of nucleotide incorporation, strand bias, and coverage. Such alterations within the polymerase can also increase the amount of sequence information and/or quality of sequencing information obtained directly, or downstream, from the improved amplification workflow utilizing such a modified polymerase.

There remains a need in the art for improved polymerase compositions (and related methods, systems, apparatuses, and kits) exhibiting altered properties, e.g., increased processivity, increased read length (including error-free read length), increased accuracy and/or affinity for DNA template, increased coverage, decreased strand bias and/or decreased systematic error. Such polymerase compositions (and related methods, systems, apparatuses, and kits) can be useful in a wide variety of assays involving polymerase-dependent nucleic acid synthesis, including nucleic acid sequencing and/or the production of nucleic acid libraries, such as nucleic acid libraries prepared by bridge PCR or clonal amplification.

SUMMARY OF THE INVENTION

The present invention in certain embodiments provides a composition that includes an isolated polypeptide, as well as isolated nucleic acids and vectors encoding the same, having at least 50, 75, 100, 150, 175, 200, 250, 300, 350, 400, 500, 600, 700, or 800 contiguous amino acid residues having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1 or SEQ ID NO: 34, or a biologically active fragment thereof, wherein the polypeptide exhibits polymerase activity. In exemplary embodiments, the isolated polypeptide exhibits an improvement relative to a reference polymerase of SEQ ID NO:1 and/or SEQ ID NO:34, in one or more properties selected from thermostability and/or a sequencing property selected from read length, accuracy, strand bias, systematic error, and total sequencing throughput. In certain embodiments, the isolated polypeptide includes one or more amino acid substitutions selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. The sequencing properties, in certain embodiments, are determined by using the isolated polypeptide in an emulsion PCR template amplification reaction, in certain illustrative embodiments in the presence of 125 mM KCl or 125 mM NaCl, during sample preparation of a nucleic acid sequencing reaction. In certain embodiments, the sequencing property is analyzed using a next-generation (i.e. massively parallel, high throughput) sequencing workflow, such as an Ion Torrent (Life Technologies, Carlsbad, Calif.) sequencing workflow, as exemplified herein. In certain aspects, the isolated polypeptide, as well as a modified polymerase used in a method embodiment provided herein, has improved thermostability at 95° C., 96° C., or 97° C. for 2 minutes, 4 minutes, and in illustrative examples 6 minutes as compared to the thermostability of SEQ ID NO: 1 at 95° C. for the same time period and temperature. In illustrative examples, the thermostability can be tested by incubating the on-test and control polymerase under identical conditions that include elevated temperatures, for example 95° C., 96° C., or 97° C. for 2 minutes, 4 minutes, and in illustrative examples 6 minutes in an incubation buffer that includes, for example 15 mM Tris pH 7.5, 100 mM KCl, 30% Trehalose, 0.1% NP40, and 50 mM polymerase enzyme. After incubation at elevated temperature, the solutions can optionally be placed on ice and then transferred to an enzyme reaction mixture that includes 15 mM Tris pH 7.5, 100 mM KCl, 8 mM MgCl₂, 150 nM Oligo 221 and 5 nM of polymerase reaction mixture from the heat-treatment step (10 ul). Oligo 221 is a hairpin oligo with a fluorescent dye attached (TTTTTTTGCAGGTGACAGGTTTTTCCTGTCACCXGC (SEQ ID NO: 50), where X is a fluorescein-dT residue). Upon addition of dATP, oligo 221 is extended, resulting in release of the fluorescence. Accordingly, as a non-limiting example, the thermostability can be tested using the method provided in Example 10 (as outlined in FIG. 14) herein. In certain illustrative embodiments, the isolated polypeptide has improved thermostability at 95° C. for 6 minutes as compared to the thermostability of SEQ ID NO: 1 at 95° C. for 6 minutes. In certain illustrative embodiments of these aspects, the thermostable isolated polypeptide, or the biologically active fragment thereof, includes G418C or E397V. In yet further embodiments, in addition to a G418C or in particular aspects, an E397V mutation, the isolated peptide further includes one or more amino acid substitutions selected from the group consisting of E745T, L763F and E805I, wherein the numbering is relative to SEQ ID NO: 1. In certain aspects, the composition includes a reagent for a hot start activation mechanism, such as an oligonucleotide and/or an aptamer. In other aspects, the isolated polypeptide is chemically modified to provide a hot start mechanism.
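By way of non-limiting illustration only, the substitution labels recited above (for example E397V, meaning glutamate at position 397 replaced by valine, with 1-based numbering relative to a reference sequence) can be parsed and applied programmatically. The following Python sketch is not part of the disclosed methods; the function names and the five-residue toy reference are hypothetical and merely stand in for a reference such as SEQ ID NO: 1, which is not reproduced here.

```python
import re

# A substitution label has the form <wild-type residue><1-based position><new residue>,
# e.g. "E397V". This pattern is an illustrative assumption about the label format.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'E397V' into ('E', 397, 'V')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(reference, labels):
    """Return a copy of `reference` carrying every substitution in `labels`.

    Positions are numbered relative to the reference, and each label's
    wild-type residue is checked against the reference before replacing it.
    """
    residues = list(reference)
    for label in labels:
        wild_type, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(reference):
            raise ValueError(f"{label}: position outside the reference")
        if reference[position - 1] != wild_type:
            raise ValueError(f"{label}: reference has {reference[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy reference, NOT SEQ ID NO: 1.
toy_reference = "MPEGK"
print(apply_substitutions(toy_reference, ["P2N", "K5I"]))
```

The wild-type check guards against applying a label to a sequence whose numbering does not match the reference to which the label refers.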

In one embodiment of the invention one or more properties exhibited by the isolated polypeptide or the biologically active fragment thereof of the composition include at least two, three, four, five, six, or all sequencing workflow properties selected from increased AQ20 mean read length reads, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb) and increased uniformity of coverage, relative to a reference polymerase having a sequence of SEQ ID NO: 34 and/or SEQ ID NO: 1. In some embodiments of the isolated polypeptide or biologically active fragment thereof where one mutation is E397V, another mutation is P6N, E745T and/or L763F. In another embodiment, that may or may not include E397V, the mutations include L763F and/or E805I, P6N and/or E295F, or E745T and/or E794C.

In a further embodiment of the invention the one or more properties of the composition are exhibited when analyzed or tested by performing an emulsion PCR template amplification reaction on a library constructed from a template having a GC content of 65%. In certain embodiments the reference polymerase is SEQ ID NO:34, and in certain particularly illustrative embodiments, the reference polymerase is SEQ ID NO:1.

In one embodiment of the inventive composition, the isolated polypeptide or biologically active fragment thereof includes a mutation selected from A77E, A97V, K240I, L287T, or K292C relative to a reference polymerase having a sequence of SEQ ID NO:1 and in exemplary aspects of this embodiment, the one or more properties include a sequencing property analyzed using a high throughput nucleic acid sequencing reaction where the polypeptide or biologically active fragment thereof is used to perform an emulsion PCR template amplification reaction on a library constructed from a template with a GC content of 65%.

In another embodiment the isolated polypeptide of the composition includes SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. It will be understood that in illustrative embodiments of the present invention, including composition and method embodiments that include an isolated polypeptide or modified polymerase provided herein, the isolated polypeptide or modified polymerase can be analyzed to determine whether it possesses certain properties, activities, or characteristics using an emulsion PCR reaction to amplify templates as part of a sequencing workflow, for example to amplify templates on a solid support, and in some illustrative embodiments, to clonally amplify templates on a solid support. The nucleic acid sequence of at least a portion of the amplified templates is then determined. This sequence determination in illustrative embodiments is performed using a high throughput sequencing platform such as Ion Torrent PGM, as exemplified herein. The results of this sequence determination are compared to results of similar experiments performed using a reference polymerase, such as Taq polymerase (SEQ ID NO:1) or the modified Taq polymerase of SEQ ID NO:34, for an emulsion PCR template amplification step in a high throughput sequencing reaction. In one aspect the test for an isolated polypeptide or a mutant polymerase includes amplifying a library of nucleic acid molecules using emulsion PCR for both an on-test and a reference polymerase, onto a nucleic acid capture support such as Ion Sphere™ particles. The amplified nucleic acid molecules in this embodiment can then be loaded into a PGM™ 314 sequencing chip, which can then be loaded into an Ion Torrent PGM™ Sequencing system and sequenced. Sequencing results for the on-test and the reference polymerase can then be compared.

In another embodiment, provided herein is a method (and related kits, apparatuses, systems and compositions) for amplifying a nucleic acid, that includes contacting the nucleic acid with a modified polymerase, or a biologically active fragment thereof, under suitable conditions for amplifying the nucleic acid, and amplifying the nucleic acid, wherein the modified polymerase or the biologically active fragment thereof, has at least 70, 75, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:34, exhibits polymerase activity and exhibits an improvement relative to a reference polymerase of SEQ ID NO:1 and/or SEQ ID NO:34, in one or more properties selected from thermostability and/or a sequencing property selected from read length, accuracy, strand bias, systematic error, and total sequencing throughput, and wherein the modified polymerase includes one or more amino acid substitutions selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. In certain particular embodiments the sequencing property is analyzed using an emulsion PCR template amplification reaction, which in especially illustrative embodiments includes 125 mM KCl or 125 mM NaCl, during sample preparation of a nucleic acid sequencing reaction. In certain embodiments, the sequencing reaction under which a sequencing property of the modified polymerase is analyzed is part of a next-generation (i.e. massively parallel, high throughput) sequencing workflow (e.g., a workflow used in an Ion Torrent System, Illumina HiSeq or True Seq or X-10 system). In some embodiments, the sequencing workflow uses an ISFET based sensor. In certain embodiments, the sequencing property is analyzed using an Ion Torrent (Life Technologies, Carlsbad, Calif.) sequencing workflow and system, as exemplified herein. In certain aspects, the modified polymerase used in the method has improved thermostability at 95° C. for 6 minutes as compared to the thermostability of SEQ ID NO: 1 at 95° C. for 6 minutes. In certain illustrative embodiments of these aspects, the thermostable modified polymerase, or the biologically active fragment thereof, used in the method includes G418C or E397V. In yet further embodiments, in addition to a G418C or in particular aspects, an E397V mutation, the isolated peptide further includes one or more amino acid substitutions selected from the group consisting of E745T, L763F and E805I, wherein the numbering is relative to SEQ ID NO: 1. In certain aspects, the method includes a hot start, as is known in the PCR arts. In these methods of the invention that include a hot start, compositions in which the method is performed can include a reagent such as an oligonucleotide and/or an aptamer that is used for the hot start or the modified polymerase can be chemically modified to provide a hot start mechanism.

In one embodiment of the invention one or more properties exhibited by the modified polymerase or the biologically active fragment thereof used in the method, include at least two, three, four, five, six, or all sequencing workflow properties selected from increased AQ20 mean read length reads, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb) and increased uniformity of coverage, relative to a reference polymerase having a sequence of SEQ ID NO: 34 and/or SEQ ID NO: 1. In some embodiments of the modified polymerase or biologically active fragment thereof where one mutation is E397V, another mutation is P6N, E745T and/or L763F. In another embodiment, that may or may not include E397V, the mutations include L763F and/or E805I, P6N and/or E295F, or E745T and/or E794C.

In a further embodiment of the invention the one or more properties of the mutant polymerase used in the method are exhibited when, or can be determined by, performing an emulsion PCR template amplification reaction on a library constructed from a template having a GC content of 65%. For the sake of clarity, such steps are not part of the inventive method, but rather are for determining whether a modified polymerase meets the criteria for a modified polymerase that is used in the method. In certain embodiments the reference polymerase used for the polymerase criteria testing is SEQ ID NO:34, and in certain particularly illustrative embodiments, the reference polymerase is SEQ ID NO:1.

In one embodiment of the inventive method, the modified polymerase or biologically active fragment thereof used in the method includes a mutation selected from A77E, A97V, K240I, L287T, or K292C relative to a reference polymerase having a sequence of SEQ ID NO:1. In exemplary aspects of this embodiment, the one or more properties of the modified polypeptide used in the method include a sequencing property analyzed using a next-generation (i.e. massively parallel, high throughput) nucleic acid sequencing reaction where the modified polymerase or biologically active fragment thereof is tested for such properties using an emulsion PCR template amplification reaction on a library constructed from a template with a GC content of 65%.

In certain embodiments, the polymerase used in the method comprises 50, 75, 100, 150, 175, 200, 250, 300, 350, 400, 500, 600, 700, or 800 contiguous amino acid residues of SEQ ID NO:1 or SEQ ID NO:34 and has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1 or SEQ ID NO: 34, or a biologically active fragment thereof. In certain embodiments the modified polymerase used in the method includes SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
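As a non-limiting illustration of the percent identity thresholds recited above, the following Python sketch computes identity between two sequences that have already been aligned to equal length. It assumes one simple convention (matching columns divided by total alignment columns, with gap characters never counted as matches); other conventions exist, and this disclosure does not prescribe this one. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must already be aligned (equal length, gaps written as `gap`).
    A column holding a gap in either sequence never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a, aligned_b, threshold_percent):
    """True when the identity is at least `threshold_percent` (e.g. 70, 90, 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Hypothetical five-residue toy alignment with one substituted position.
print(percent_identity("MPEGK", "MNEGK"))
print(meets_identity_threshold("MPEGK", "MNEGK", 70))
```

In practice an alignment step would precede this calculation; it is omitted here to keep the sketch self-contained.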

In some embodiments of the method for amplifying a nucleic acid, suitable conditions for performing the amplification are suitable conditions for performing a polymerase chain reaction, an isothermal amplification reaction, a recombinase polymerase amplification reaction, a proximity ligation amplification, a rolling circle amplification, a strand displacement amplification, or an emulsion polymerase chain reaction. Accordingly, in these embodiments, the method for amplifying the nucleic acid is one of the above listed methods for amplification.

In yet another embodiment, the method for amplifying a nucleic acid includes clonally amplifying the nucleic acid in solution or on a solid support. A further embodiment of the method includes determining the nucleic acid sequence of at least a portion of the nucleic acid. In some embodiments, the nucleic acid sequence can be determined using any next-generation (i.e. massively parallel, high throughput) sequencing platform (e.g., Ion Torrent Systems, Illumina HiSeq or True Seq or X-10 systems). In some embodiments, the nucleic acid sequence can be determined using any ISFET based sequencing system.

In a further embodiment of the method the nucleic acid comprises at least 65% GC content or at least 65% AT content.
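For illustration only, whether a given template sequence meets the 65% GC or 65% AT content recited above can be checked with a short calculation. The Python sketch below is not part of the disclosed method; the function names and example sequences are hypothetical, and bases other than A, C, G and T (such as N) are counted in the sequence length but in neither fraction.

```python
def base_fraction(sequence, bases):
    """Fraction of positions in `sequence` occupied by any base in `bases`."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(sequence.count(base) for base in bases) / len(sequence)


def is_high_gc_or_at(sequence, threshold=0.65):
    """True when GC content or AT content is at least `threshold`."""
    gc_content = base_fraction(sequence, "GC")
    at_content = base_fraction(sequence, "AT")
    return gc_content >= threshold or at_content >= threshold


# Hypothetical short templates used only to exercise the functions.
print(base_fraction("GGCCA", "GC"))
print(is_high_gc_or_at("GGCCA"))
print(is_high_gc_or_at("ACGT"))
```

The 0.65 default mirrors the 65% figure recited above; any other threshold can be passed explicitly.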

Another embodiment of the invention is a method for performing a nucleic acid polymerization reaction including contacting a modified polymerase, or a biologically active fragment thereof, under suitable conditions for a polymerization reaction, with a nucleic acid template in the presence of one or more nucleotide triphosphates, wherein the modified polymerase or the biologically active fragment thereof, has at least 70, 75, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:34, exhibits polymerase activity and exhibits an improvement relative to a reference polymerase of SEQ ID NO:1 and/or SEQ ID NO:34, in one or more properties selected from thermostability and/or a sequencing property selected from read length, accuracy, strand bias, systematic error, and total sequencing throughput, and wherein the modified polymerase includes one or more amino acid substitutions selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. To analyze the sequencing properties of the modified polymerase of the method of the invention, an emulsion PCR template amplification reaction can be used, and in particular embodiments using conditions that include 125 mM KCl or 125 mM NaCl, as sample preparation followed by a nucleic acid sequencing reaction. For the sake of clarity, the emulsion PCR template amplification reaction and the nucleic acid sequence reaction recited above, are not steps of the method of this embodiment of the invention. Rather, they are part of a method that can be used to determine whether a polymerase is a modified polymerase used in the recited method embodiment of the invention.

In certain embodiments, the sequencing workflow under which a sequencing property of the modified polymerase is analyzed is a next-generation (i.e. massively parallel, high throughput) sequencing workflow (e.g., a workflow used in an Ion Torrent System, Illumina HiSeq or True Seq or X-10 system). In some embodiments, the sequencing workflow uses an ISFET based sequencing system workflow. In certain embodiments, the sequencing property is analyzed using an Ion Torrent (Life Technologies, Carlsbad, Calif.) sequencing workflow and system, as exemplified herein. In certain aspects, the modified polymerase used in the method has improved thermostability at 95° C. for 6 minutes as compared to the thermostability of SEQ ID NO: 1 at 95° C. for 6 minutes. In certain illustrative embodiments of these aspects, the thermostable modified polymerase, or the biologically active fragment thereof, used in the method includes G418C or E397V. In yet further embodiments, in addition to a G418C or in particular aspects, an E397V mutation, the isolated peptide further includes one or more amino acid substitutions selected from the group consisting of E745T, L763F and E805I, wherein the numbering is relative to SEQ ID NO: 1. In certain aspects, the method includes a hot start, as is known in the PCR arts. In these methods of the invention that include a hot start, compositions in which the method is performed can include a reagent such as an oligonucleotide and/or an aptamer that is used for the hot start or the modified polymerase can be chemically modified to provide a hot start mechanism.

In one embodiment of the invention one or more properties exhibited by the modified polymerase or the biologically active fragment thereof used in the method, include at least two, three, four, five, six, or all sequencing workflow properties selected from increased AQ20 mean read length reads, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb) and increased uniformity of coverage, relative to a reference polymerase having a sequence of SEQ ID NO: 34 and/or SEQ ID NO: 1. In some embodiments of the modified polymerase or biologically active fragment thereof where one mutation is E397V, another mutation is P6N, E745T and/or L763F. In another embodiment, that may or may not include E397V, the mutations include L763F and/or E805I, P6N and/or E295F, or E745T and/or E794C.

In a further embodiment of the invention the one or more properties of the mutant polymerase used in the method are exhibited when, or can be determined by, performing an emulsion PCR template amplification reaction on a library constructed from a template having a GC content of 65%. For the sake of clarity, such steps are not part of the inventive method, but rather are for determining whether a modified polymerase meets the criteria for a modified polymerase that is used in the method. In certain embodiments the reference polymerase used for the polymerase criteria testing is SEQ ID NO:34, and in certain particularly illustrative embodiments, the reference polymerase is SEQ ID NO:1.

In one embodiment of the inventive method, the modified polymerase or biologically active fragment thereof used in the method includes a mutation selected from A77E, A97V, K240I, L287T, or K292C relative to a reference polymerase having a sequence of SEQ ID NO:1. In exemplary aspects of this embodiment, the one or more properties of the modified polypeptide used in the method include a sequencing property analyzed using a next-generation (high throughput) nucleic acid sequencing reaction where the modified polymerase or biologically active fragment thereof is tested for such properties using an emulsion PCR template amplification reaction on a library constructed from a template with a GC content of 65%.

In certain embodiments, the polymerase used in the method comprises 50, 75, 100, 150, 175, 200, 250, 300, 350, 400, 500, 600, 700, or 800 contiguous amino acid residues of SEQ ID NO:1 or SEQ ID NO:34 and has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1 or SEQ ID NO: 34, or a biologically active fragment thereof. In certain embodiments the modified polymerase used in the method includes SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In yet another embodiment of the invention, provided herein is a method for obtaining sequence information from a nucleic acid template, that includes: providing a reaction mixture, including the nucleic acid template hybridized to a sequencing primer and bound to a modified polymerase or a biologically active fragment thereof, contacting the template nucleic acid with at least one type of nucleotide triphosphate, wherein the contacting includes incorporating one or more nucleotides from the at least one type of nucleotide onto the 3′ end of the sequencing primer and generating an extended primer product; detecting the presence of the extended primer product in the reaction mixture, thereby determining whether nucleotide incorporation has occurred; and identifying at least one of the one or more nucleotides incorporated from the at least one type of nucleotide triphosphate, wherein the modified polymerase or the biologically active fragment thereof, has at least 70, 75, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to SEQ ID NO:1 or SEQ ID NO:34, exhibits polymerase activity and exhibits an improvement, relative to a reference polymerase of SEQ ID NO:1 and/or SEQ ID NO:34, in one or more properties selected from thermostability and/or a sequencing workflow property selected from read length, accuracy, strand bias, systematic error, and total sequencing throughput, and the modified polymerase includes one or more amino acid substitutions selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. The sequencing workflow properties of the modified polymerase or the biologically active fragment thereof, can be analyzed using an emulsion PCR template amplification reaction, for example that includes 125 mM KCl or 125 mM NaCl, during sample preparation of a nucleic acid sequencing reaction. In certain embodiments, the method is a next-generation sequencing method. In some embodiments, the method uses an ISFET detection system.

In certain aspects, the modified polymerase used in the method has improved thermostability at 95° C. for 6 minutes as compared to the thermostability of SEQ ID NO: 1 at 95° C. for 6 minutes. In certain illustrative embodiments of these aspects, the thermostable modified polymerase, or the biologically active fragment thereof, used in the method includes G418C or E397V. In yet further embodiments, in addition to a G418C or, in particular aspects, an E397V mutation, the isolated peptide further includes one or more amino acid substitutions selected from the group consisting of E745T, L763F and E805I, wherein the numbering is relative to SEQ ID NO: 1.
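By way of illustration only, the substitution notation used above (original residue, position numbered relative to the reference sequence, replacement residue) can be read and applied programmatically. The following is a minimal sketch, not part of the disclosed methods; the function names and the five-residue toy reference are hypothetical, and the toy reference is not SEQ ID NO: 1.

```python
import re

# Labels of the form <original residue><1-based position><replacement>, e.g. "E397V".
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'E397V' into (original, position, replacement)."""
    match = SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_substitutions(reference, labels):
    """Return a copy of `reference` carrying the listed substitutions.

    Positions are 1-based and counted along the reference sequence.
    Each label is checked against the unmodified reference residue.
    """
    residues = list(reference)
    for label in labels:
        original, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the reference")
        if reference[position - 1] != original:
            raise ValueError(
                f"{label}: reference has {reference[position - 1]} at position {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy reference used only to exercise the functions.
toy_reference = "MPGLE"
toy_variant = apply_substitutions(toy_reference, ["P2N", "E5V"])
print(toy_variant)
```

The check against the reference residue is a design choice of this sketch: it flags a label whose numbering does not match the sequence it is applied to.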

In one embodiment of the invention, the one or more properties exhibited by the modified polymerase or the biologically active fragment thereof used in the method include at least two, three, four, five, six, or all sequencing workflow properties selected from increased AQ20 mean read length, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb) and increased uniformity of coverage, relative to a reference polymerase having a sequence of SEQ ID NO: 34 and/or SEQ ID NO: 1. In some embodiments of the modified polymerase or biologically active fragment thereof, one mutation is E397V and another mutation is P6N, E745T and/or L763F. In another embodiment, which may or may not include E397V, the mutations include L763F and/or E805I, P6N and/or E295F, or E745T and/or E794C.

In one embodiment of the inventive method, the modified polymerase or biologically active fragment thereof used in the method includes a mutation selected from A77E, A97V, K240I, L287T, or K292C relative to a reference polymerase having a sequence of SEQ ID NO: 1. In exemplary aspects of this embodiment, the one or more properties of the modified polypeptide used in the method include a sequencing property analyzed using a next-generation (high throughput) nucleic acid sequencing reaction where the modified polymerase or biologically active fragment thereof is tested for such properties using an emulsion PCR template amplification reaction on a library constructed from a template with a GC content of 65%.

In certain embodiments, the polymerase used in the method comprises 50, 75, 100, 150, 175, 200, 250, 300, 350, 400, 500, 600, 700, or 800 contiguous amino acid residues of SEQ ID NO: 1 or SEQ ID NO: 34 and has at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1 or SEQ ID NO: 34, or a biologically active fragment thereof. In certain embodiments the modified polymerase used in the method includes SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In further aspects of the method, the contacting, detecting and identifying steps are repeated more than once, thereby identifying a plurality of sequential nucleotide incorporations. In certain aspects, at least one of the nucleotides incorporated is a reversible terminator nucleotide.

In another embodiment, provided herein is a kit with two or more vessels, where one vessel includes a component for performing a nucleic acid polymerization reaction, and another vessel comprises a modified polymerase, or a biologically active fragment thereof, that has at least 70, 75, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to SEQ ID NO: 1 or SEQ ID NO: 34, that exhibits polymerase activity and that exhibits an improvement, relative to a reference polymerase of SEQ ID NO: 1 and/or SEQ ID NO: 34, in one or more properties selected from thermostability and/or a sequencing workflow property selected from read length, accuracy, strand bias, systematic error, and total sequencing throughput, and the modified polymerase includes one or more amino acid substitutions selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. The properties can be measured using an emulsion PCR template amplification reaction, which in certain illustrative embodiments is performed in the presence of 125 mM KCl or 125 mM NaCl, during sample preparation of a nucleic acid sequencing reaction, such as a high throughput or next-generation sequencing reaction.

In further embodiments of the invention the kit includes nucleotide tri-phosphates, MgCl₂, and/or a buffer for a nucleic acid polymerization reaction. The kit can further include a reagent for a hot start mechanism. In yet a further embodiment the kit includes a component for forming an emulsion.

In some embodiments, the disclosure relates generally to methods (and related kits, systems, apparatuses and compositions) for performing a nucleotide polymerization reaction comprising or consisting of contacting a modified polymerase or a biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase, and where the modified polymerase or the biologically active fragment thereof has improved accuracy, coverage and/or processivity as compared to the reference polymerase, and polymerizing at least one of the one or more nucleotides using the modified polymerase or the biologically active fragment thereof.

In some embodiments, the disclosure relates generally to methods (and related kits, systems, apparatuses and compositions) for performing a nucleotide polymerization reaction comprising or consisting of contacting a modified polymerase or a biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase, and where the modified polymerase or the biologically active fragment thereof has an increased thermostability relative to the reference polymerase, and polymerizing at least one of the one or more nucleotides using the modified polymerase or the biologically active fragment thereof. In some embodiments, the method includes polymerizing at least one of the one or more nucleotides using the modified polymerase or the biologically active fragment thereof in the presence of a high ionic strength solution. In some embodiments, a high ionic strength solution can include a solution in excess of 100 mM KCl. In some embodiments, a high ionic strength solution includes a solution that is at least 120 mM KCl. In some embodiments, a high ionic strength solution includes a solution that is 125 mM to 200 mM KCl.
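For illustration, the three KCl ranges recited above can be expressed as simple checks on a concentration in mM. This is a minimal sketch; the function name and dictionary keys are hypothetical labels chosen for this example, and the code does not model ionic strength beyond comparing the stated KCl concentration against the recited bounds.

```python
def ionic_strength_tiers(kcl_mm):
    """Report which recited KCl ranges a concentration (in mM) falls into."""
    return {
        "in_excess_of_100_mM": kcl_mm > 100,           # "in excess of 100 mM KCl"
        "at_least_120_mM": kcl_mm >= 120,              # "at least 120 mM KCl"
        "within_125_to_200_mM": 125 <= kcl_mm <= 200,  # "125 mM to 200 mM KCl"
    }


for concentration in (110, 125, 250):
    print(concentration, ionic_strength_tiers(concentration))
```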

In some embodiments, the method can further include polymerizing one of the at least one nucleotides in a template-dependent fashion. In some embodiments, the polymerizing is performed under thermocycling conditions. In some embodiments, the method can further include hybridizing a primer to the nucleic acid template prior to, during, or after the contacting, and where the polymerizing includes polymerizing one of the at least one nucleotides onto an end of the primer using the modified polymerase or the biologically active fragment thereof. In some embodiments, the polymerizing is performed in the proximity of a sensor that is capable of detecting the polymerization of the at least one nucleotide by the modified polymerase or the biologically active fragment thereof. In some embodiments, the method can further include detecting a signal indicating the polymerization of the at least one nucleotide by the modified polymerase or the biologically active fragment thereof using a sensor. In some embodiments, the sensor is an ISFET. In some embodiments, the sensor can include a detectable label or detectable reagent within the polymerizing reaction.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 80% identity to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.
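To make the combination of a contiguous-residue length and a percent identity threshold concrete, the following minimal sketch scores fixed-length windows of a candidate sequence against the same positions of a reference. It assumes the two sequences are already in register (no gaps), which is a simplification chosen for this example; this section does not specify an alignment method, and the function names and toy sequences are hypothetical.

```python
def percent_identity(a, b):
    """Percent of positions at which two equal-length sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def best_window_identity(candidate, reference, window):
    """Highest ungapped identity over any stretch of `window` contiguous residues.

    Assumes candidate and reference share the same numbering (position i of
    one corresponds to position i of the other).
    """
    usable = min(len(candidate), len(reference))
    if window < 1 or window > usable:
        raise ValueError("window must fit inside both sequences")
    return max(
        percent_identity(candidate[start:start + window],
                         reference[start:start + window])
        for start in range(usable - window + 1)
    )


# Toy sequences for illustration only; they are not sequences from the listing.
toy_reference = "ACDEFGHIKL"
toy_candidate = "ACDEFGHIKV"  # differs from the toy reference at the last position
print(best_window_identity(toy_candidate, toy_reference, 5))
print(best_window_identity(toy_candidate, toy_reference, 10))
```

A gapped alignment tool would be needed for sequences that differ by insertions or deletions; the sketch deliberately leaves that out.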

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 90% identity to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 3, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to methods (and related kits, apparatus, systems and compositions) for performing nucleic acid amplification comprising or consisting of generating an amplification reaction mixture having a modified polymerase or a biologically active fragment thereof, a primer, a nucleic acid template, and one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase and has improved thermostability relative to the reference polymerase; and subjecting the amplification reaction mixture to amplifying conditions, where at least one of the one or more nucleotides is polymerized onto the end of the primer using the modified polymerase or the biologically active fragment thereof. In some embodiments, the modified polymerase or the biologically active fragment thereof having improved thermostability relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34), comprises or consists of at least 80% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof having improved thermostability relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34), comprises or consists of at least 90% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof having improved thermostability relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34), comprises or consists of at least 95% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof having improved thermostability relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34), comprises or consists of at least 98% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof having improved thermostability relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34), comprises or consists of at least 99% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure generally relates to methods (and related kits, apparatus, systems and compositions) for performing nucleic acid amplification comprising or consisting of generating an amplification reaction mixture having a modified polymerase or a biologically active fragment thereof, a primer, a nucleic acid template, and one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase and has improved accuracy relative to the reference polymerase; and subjecting the amplification reaction mixture to amplifying conditions, where at least one of the one or more nucleotides is polymerized onto the end of the primer using the modified polymerase or the biologically active fragment thereof. In some embodiments, the modified polymerase or the biologically active fragment thereof having improved accuracy relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34) comprises or consists of at least 80% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the method further includes determining the identity of the one or more nucleotides polymerized by the modified polymerase. In some embodiments, the method further includes determining the number of nucleotides polymerized by the modified polymerase. In some embodiments, at least 50% of the one or more nucleotides polymerized by the modified polymerase are identified. In some embodiments, substantially all of the one or more nucleotides polymerized by the modified polymerase are identified. In some embodiments, the polymerization occurs in the presence of a high ionic strength solution. In some embodiments the high ionic strength solution comprises 125 mM to 200 mM salt. In some embodiments, the polymerization occurs in the presence of an ionic strength solution of at least 120 mM salt. In some embodiments, the high ionic strength solution comprises KCl and/or NaCl.

In some embodiments, the modified polymerase or the biologically active fragment thereof having improved accuracy relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34), comprises or consists of at least 90% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof having improved accuracy relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34), comprises or consists of at least 95% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof having improved accuracy relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34), comprises or consists of at least 98% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof having improved accuracy relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34), comprises or consists of at least 99% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof further comprises at least 25 contiguous amino acids of the polymerase DNA binding domain. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises at least 50 contiguous amino acid residues of the polymerase DNA binding domain. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues of the polymerase DNA binding domain. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues of the polymerase DNA binding domain, while also having at least 90% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues of the polymerase DNA binding domain having at least 95% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the methods (and related kits, systems, apparatuses and compositions) include amplifying conditions having a high ionic strength solution. In one embodiment, a high ionic strength solution is a solution having at least 120 mM KCl. In some embodiments, a high ionic strength solution includes a solution that is 125 mM to 200 mM KCl.

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatuses and compositions) for performing a nucleotide polymerization reaction comprising or consisting of mixing a modified polymerase or a biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase (such as SEQ ID NO: 1 or SEQ ID NO: 34); and polymerizing at least one of the one or more nucleotides using the modified polymerase or the biologically active fragment thereof in the mixture. In some embodiments, the modified polymerase or the biologically active fragment thereof has increased accuracy as determined by measuring increased accuracy in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution refers to a reaction mixture for performing nucleotide polymerization having at least 120 mM KCl. In some embodiments, a high ionic strength solution includes a solution that is 125 mM to 200 mM KCl.

In some embodiments, the methods (and related kits, apparatus, systems and compositions) comprise a modified polymerase or a biologically active fragment thereof comprising or consisting of at least 80% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 90% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 95% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 98% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
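
By way of illustration only, the percent identity thresholds recited above (e.g., at least 80%, 90%, 95% or 98%) could be evaluated computationally along the following lines. This is a minimal sketch and not a method prescribed by the disclosure: it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it assumes identity is counted over all aligned columns other than gap-to-gap columns; other conventions for the denominator exist. The function names `percent_identity` and `meets_identity_threshold` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: gaps are written as '-', columns where both sequences have a
    gap are ignored, and a residue opposite a gap counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the percent identity is at least `threshold` (e.g., 90.0)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue sequences (hypothetical; not taken from the Sequence Listing).
reference = "ACDEFGHIKL"
candidate = "ACDEFGHIKV"
print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate, 95.0))
```

In practice the alignment itself would be produced by an established alignment tool; the sketch only covers the counting step once an alignment is available.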

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatus and compositions) for detecting nucleotide incorporation comprising or consisting of performing a nucleotide incorporation reaction using a modified polymerase or a biologically active fragment thereof having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, a nucleic acid template, and one or more nucleotide triphosphates; generating the nucleotide incorporation; and detecting the nucleotide incorporation. Detecting nucleotide incorporation can occur via any appropriate means such as PAGE, fluorescence, dPCR quantitation, nucleotide by-product production (e.g., hydrogen ion or pyrophosphate detection; suitable nucleotide by-product detection systems include, without limitation, next-generation sequencing platforms such as Rain Dance, Roche 454, and Ion Torrent Systems) or nucleotide extension product detection (e.g., optical detection of extension products or detection of labelled nucleotide extension products).
In some embodiments, the methods (and related kits, systems, apparatus and compositions) for detecting nucleotide incorporation include or consist of detecting nucleotide incorporation using a modified polymerase or a biologically active fragment thereof that includes at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method of detecting nucleotide incorporation includes or consists of detecting nucleotide incorporation using a modified polymerase or a biologically active fragment thereof that includes at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the method of detecting nucleotide incorporation includes or consists of detecting nucleotide incorporation by a modified polymerase or a biologically active fragment thereof that includes at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method further comprises determining the identity of one or more nucleotides in the nucleotide incorporation. In some embodiments, the byproduct of the nucleotide incorporation is a hydrogen ion. In some embodiments, the byproduct of the nucleotide incorporation is a pyrophosphate. In some embodiments, the byproduct of the nucleotide incorporation is a labeled nucleotide extension product. In some embodiments, the method of detecting nucleotide incorporation includes generating the nucleotide incorporation under emulsion PCR or bridge PCR conditions.

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatus and compositions) for detecting a change in ion concentration during a nucleotide polymerization reaction comprising or consisting of performing a first nucleotide polymerization reaction on a nucleic acid template or nucleic acid library in the presence of one or more nucleotides to be incorporated during the first nucleotide polymerization reaction, wherein the first nucleotide polymerization reaction includes a modified polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and performing a second nucleotide polymerization reaction, wherein the second nucleotide polymerization reaction detects at least one type of ion concentration change during the course of the second nucleotide polymerization reaction and provides a signal indicating a change in ion concentration of the at least one type of ion. In some embodiments, the ion is a hydrogen ion. In some embodiments, the ion is a pyrophosphate ion. In some embodiments, the signal indicating a change in ion concentration is a relative increase in the production of hydrogen ions in the polymerization reaction. In some embodiments, detection of at least one type of ion concentration change is monitored using an ISFET.
In some embodiments, the modified polymerase or the biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 150 contiguous amino acid residues of a polymerase having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the modified polymerase or the biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 200 contiguous amino acid residues of the polymerase having at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
In some embodiments, the modified polymerase or the biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 250 contiguous amino acid residues of the polymerase having at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the modified polymerase or the biologically active fragment from the first nucleotide polymerization reaction comprises or consists of a polymerase having at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatus and compositions) for amplifying a nucleic acid comprising or consisting of contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the amplifying is performed using a polymerase chain reaction, emulsion polymerase chain reaction, isothermal amplification reaction, recombinase polymerase amplification reaction, proximity ligation amplification, rolling circle amplification or strand displacement amplification. In some embodiments, the amplifying includes clonally amplifying the nucleic acid in solution. In some embodiments, the amplifying includes clonally amplifying the nucleic acid on a solid support such as a nucleic acid bead, flow cell, nucleic acid array, or wells present on the surface of the solid support. In some embodiments, the amplifying is performed using a polymerase or biologically active fragment comprising a thermostable DNA polymerase. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase having improved thermostability as compared to a reference polymerase, such as SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase having improved accuracy as compared to a reference polymerase, such as SEQ ID NO: 1 or SEQ ID NO: 34.

In some embodiments, the methods (and related kits, systems, apparatus and compositions) for amplifying a nucleic acid comprise contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase having an improved average read length as compared to the average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34 under identical amplification conditions.

In some embodiments, the methods for amplifying a nucleic acid comprise contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the method includes a polymerase or biologically active fragment having an improved average read length as compared to the average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34 under identical amplification conditions.

In some embodiments, the methods for amplifying a nucleic acid comprise contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the method includes a polymerase or biologically active fragment having an improved average read length as compared to the average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34 under identical amplification conditions.

In some embodiments, the methods for amplifying a nucleic acid comprise contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the method includes a polymerase or biologically active fragment having an improved average read length as compared to the average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34 under identical amplification conditions.

In some embodiments, average read length is determined by analyzing the read length of the amplified nucleic acids obtained using one or more of the modified polymerases provided herein, across all reads, to establish an average read length and comparing that average read length to the average read length obtained using the reference polymerase.
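
By way of illustration only, the comparison described above can be sketched as follows. The sketch assumes that "average" means the arithmetic mean across all reads and that the read lengths for the modified and reference polymerases were obtained under identical amplification conditions; the function names and the example read lengths are hypothetical and are not data from the disclosure.

```python
from statistics import mean


def average_read_length(read_lengths):
    """Arithmetic mean of read lengths (in bases) across all reads."""
    lengths = list(read_lengths)
    if not lengths:
        raise ValueError("at least one read is required")
    return mean(lengths)


def read_length_ratio(modified_lengths, reference_lengths):
    """Average read length of the modified polymerase divided by that of
    the reference polymerase; a value above 1 indicates an improvement."""
    return average_read_length(modified_lengths) / average_read_length(reference_lengths)


# Hypothetical read lengths, for illustration only.
modified_reads = [210, 190, 200]
reference_reads = [150, 170, 160]
print(average_read_length(modified_reads))
print(read_length_ratio(modified_reads, reference_reads))
```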

In some embodiments, the disclosure generally relates to methods for amplifying a nucleic acid comprising or consisting of contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the amplifying is performed by a polymerase or biologically active fragment having improved templating efficiency as compared to a reference sample, such as SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the method for amplifying a nucleic acid comprises amplifying the nucleic acid under emulsion PCR conditions. In some embodiments, the method for amplifying a nucleic acid comprises amplifying the nucleic acid under bridge PCR conditions. In some embodiments, the bridge PCR conditions include hybridizing one or more of the amplified nucleic acids to a solid support. In some embodiments, the hybridized one or more amplified nucleic acids can be used as a template for further amplification. In some embodiments, the modified polymerase or biologically active fragment thereof comprises a polymerase that is derived from Thermus aquaticus DNA polymerase (Taq). SEQ ID NO: 1 is the full-length, wild-type nucleic acid sequence of Thermus aquaticus (Taq) DNA polymerase. In some embodiments, Taq DNA polymerase can be used as a reference polymerase in the methods, kits, apparatus, systems and compositions described herein.

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatus and compositions) for synthesizing a nucleic acid comprising or consisting of incorporating at least one nucleotide onto the end of a primer using a modified polymerase or a biologically active fragment thereof having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. Optionally, the method further comprises detecting incorporation of the at least one nucleotide onto the end of the primer. In some embodiments, the method further includes determining the identity of at least one of the at least one nucleotide incorporated onto the end of the primer. In some embodiments, the method can include determining the identity of all nucleotides incorporated onto the end of the primer. In some embodiments, the method includes synthesizing the nucleic acid in a template-dependent manner. In some embodiments, the method can include synthesizing the nucleic acid in solution, on a solid support, or in an emulsion (such as emPCR).

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 95% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 97% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 98% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 99% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.
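
By way of illustration only, the mutation labels recited above can be read and applied programmatically. The sketch below assumes the conventional reading of a label such as P6N, namely that the residue given by the first letter (P) at the 1-based position (6) of the reference sequence is replaced by the residue given by the last letter (N). The function names and the 10-residue toy sequence are hypothetical; the toy sequence is not SEQ ID NO: 1.

```python
import re

# One-letter original residue, 1-based position, one-letter replacement.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'P6N' into ('P', 6, 'N')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(sequence, labels):
    """Return a copy of `sequence` with each labelled substitution applied.

    Raises ValueError if a position is out of range or if the residue found
    at that position differs from the original residue named in the label.
    """
    residues = list(sequence)
    for label in labels:
        original, position, replacement = parse_substitution(label)
        if position < 1 or position > len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != original:
            raise ValueError(f"{label}: expected {original} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy sequence with proline at position 6.
toy_reference = "MRGMLPLFEP"
print(apply_substitutions(toy_reference, ["P6N"]))
```

Checking the original residue before substituting guards against applying a label to a sequence with different numbering, such as a truncated fragment.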

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of SEQ ID NO: 2.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 95% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 97% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 98% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 99% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of SEQ ID NO: 3.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80% identity to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 95% identity to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 98% identity to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of SEQ ID NO: 4.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80% identity to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 95% identity to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 98% identity to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to an isolatedand purified polypeptide comprising or consisting of at least 90%identity to SEQ ID NO: 5 and having one or more amino acid mutationsselected from the group consisting of A77E, A97V, L193V, K240I, R266Q,E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S,S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F,E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolatedand purified polypeptide comprising or consisting of at least 90%identity to SEQ ID NO: 6 and having one or more amino acid mutationsselected from the group consisting of P6N, A97V, L193V, K240I, R266Q,E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S,S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F,E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolatedand purified polypeptide comprising or consisting of at least 90%identity to SEQ ID NO: 7 and having one or more amino acid mutationsselected from the group consisting of P6N, A77E, L193V, K240I, R266Q,E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S,S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F,E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolatedand purified polypeptide comprising or consisting of at least 90%identity to SEQ ID NO: 8 and having one or more amino acid mutationsselected from the group consisting of P6N, A77E, A97V, K240I, R266Q,E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S,S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F,E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 9 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 10 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 11 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 12 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 13 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 14 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 15 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 16 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 17 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 18 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 19 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 20 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 21 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 22 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 23 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 24 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 25 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 26 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 27 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 28 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 29 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 30 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 31 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 32 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 90% identity to SEQ ID NO: 33 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C and E805I.
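
By way of non-limiting illustration, the substitution notation used above (e.g., P6N, meaning that the proline at position 6 of the reference sequence is replaced by asparagine) can be read and applied programmatically. The following is a minimal sketch only; the function names and the toy reference string are hypothetical, are not part of the disclosure, and do not reproduce any sequence of the Sequence Listing.

```python
import re

# Hypothetical helpers for illustration; the disclosure defines no software.
# A substitution code is: original residue, 1-based position, new residue.
SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code):
    """Split a code such as 'E295F' into ('E', 295, 'F')."""
    match = SUBSTITUTION.match(code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_substitutions(reference, codes):
    """Return a copy of `reference` with each substitution applied.

    Positions are numbered relative to `reference`, mirroring how the
    text numbers residues relative to a reference sequence.
    """
    residues = list(reference)
    for code in codes:
        original, position, replacement = parse_substitution(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the reference")
        if reference[position - 1] != original:
            raise ValueError(
                f"{code}: reference has {reference[position - 1]} "
                f"at position {position}, not {original}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Toy 8-residue reference (an assumption, NOT SEQ ID NO: 1).
toy_reference = "MKLAAPGE"
toy_variant = apply_substitutions(toy_reference, ["P6N"])
print(toy_variant)
```

Checking the original residue before substituting guards against applying a code to a sequence that uses different numbering than the reference.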

In some embodiments, the disclosure is generally related to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a composition comprising an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a composition comprising an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, and further comprising one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to a vector comprising an isolated nucleic acid sequence encoding a polypeptide or a biologically active fragment thereof selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33. In some embodiments, the vector comprising the isolated nucleic acid sequence encoding a polypeptide or biologically active fragment thereof includes a DNA polymerase. In some embodiments, the DNA polymerase is a Thermus aquaticus (Taq) polymerase. In some embodiments, the DNA polymerase is a thermostable DNA polymerase. In some embodiments, the DNA polymerase is derived from a thermostable Thermus aquaticus (Taq) polymerase.

In some embodiments, the disclosure is generally related to a composition comprising an isolated polypeptide having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a composition comprising an isolated polypeptide having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a composition comprising an isolated polypeptide having at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a composition comprising an isolated polypeptide having at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a composition comprising an isolated polypeptide having at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
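
By way of non-limiting illustration, a percent identity threshold such as those recited above (80%, 90%, 95%, 98% or 99%) can be checked once two sequences have been aligned. This passage does not specify an alignment method or an identity convention, so the sketch below makes two labeled assumptions: the inputs are already aligned to equal length with "-" marking gaps, and identity is the number of identical residue columns divided by the number of aligned columns. The function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed convention (not taken from the disclosure): identical,
    non-gap columns divided by all columns that are not gap-vs-gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_a, aligned_b))
    columns = sum(1 for a, b in pairs if not (a == "-" and b == "-"))
    if columns == 0:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in pairs if a == b and a != "-")
    return 100.0 * matches / columns


def meets_identity(aligned_a, aligned_b, threshold_percent):
    """True if the pair satisfies an 'at least N% identity' limit."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Toy 10-residue strings (assumptions, not sequences of the listing);
# they differ at exactly one position.
toy_reference = "MKLAAPGEQR"
toy_candidate = "MKLAANGEQR"
print(percent_identity(toy_reference, toy_candidate))
print(meets_identity(toy_reference, toy_candidate, 90))
print(meets_identity(toy_reference, toy_candidate, 95))
```

Other conventions, for example dividing by the length of the shorter sequence, give different values for the same pair, so the convention would need to be fixed before comparing against a threshold.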

In some embodiments, the disclosure is generally related to a composition comprising an isolated nucleic acid having at least 80% identity to SEQ ID NO: 1 and further comprising at least one amino acid substitution selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 and L828, wherein the numbering is specific to amino acid residues of SEQ ID NO: 1.

In some embodiments, the disclosure is generally related to a composition comprising an isolated nucleic acid having at least 80% identity to SEQ ID NO: 1 and further comprising at least one amino acid substitution selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is specific to amino acid residues of SEQ ID NO: 1.

In some embodiments, the composition comprises at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 and further comprises at least one amino acid substitution selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 and L828, wherein the numbering is specific to amino acid residues of SEQ ID NO: 1.

In some embodiments, the composition comprises at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 and further comprises at least one amino acid substitution selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is specific to amino acid residues of SEQ ID NO: 1.

In some embodiments, the composition comprises or consists of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the composition comprises at least 85%, 90%, 95%, 98% or 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, and further comprising at least one amino acid substitution selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is specific to amino acid residues of SEQ ID NO: 1.

In some embodiments, the disclosure is generally related to a kit comprising an isolated polypeptide having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the kit comprises an isolated polypeptide having at least 90%, 95%, 96%, 97%, 98% or 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the kit comprises an isolated polypeptide selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the kit comprises an isolated polypeptide comprising or consisting of at least 250 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the kit comprises an isolated polypeptide comprising or consisting of at least 450 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the kit comprises an isolated polypeptide comprising or consisting of at least 650 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the kits further comprise dNTPs, one or more buffers and/or MgCl2.

In some embodiments, the disclosure generally relates to a polymerase or a biologically active fragment thereof having DNA polymerase activity and at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33, wherein the polymerase or the biologically active fragment having DNA polymerase activity includes at least one amino acid substitution as compared to SEQ ID NO: 1 or SEQ ID NO: 34.

In some embodiments, the at least one amino acid substitution as compared to SEQ ID NO: 1 or SEQ ID NO: 34 can impart a beneficial property to the polymerase or biologically active fragment thereof. In some embodiments, the beneficial property imparted to the polymerase or biologically active fragment thereof (as compared to SEQ ID NO: 1 or SEQ ID NO: 34) includes improved thermostability, improved read length, improved templating efficiency, improved performance in a high ionic strength solution or improved accuracy. In some embodiments, the beneficial property imparted to the polymerase or biologically active fragment thereof (as compared to SEQ ID NO: 1 or SEQ ID NO: 34) includes reduced strand bias of GC and AT rich nucleic acids. It will be generally understood that the beneficial property imparted to the polymerase or biologically active fragment thereof (as compared to the properties of SEQ ID NO: 1 or SEQ ID NO: 34) can be determined by assessing and/or measuring such properties under identical conditions (e.g., comparing the properties of SEQ ID NO: 1 against the polymerase or biologically active fragment thereof, under identical conditions). For example, the accuracy of a DNA polymerase can be measured in terms of the longest perfect read (typically measured in terms of the number of nucleotides correctly included in the read) obtained from a nucleotide polymerization reaction. In some embodiments, the nucleotide polymerization reaction can be conducted using emulsion PCR, bridge PCR or hot-start PCR conditions. In some embodiments, one or more of the beneficial properties imparted to the polymerase or biologically active fragment thereof can be determined by assessing sequencing accuracy. In some embodiments, sequencing accuracy can be determined using any next-generation sequencing platform (e.g., Ion Torrent Systems, Illumina HiSeq or True Seq or X-10 systems). In some embodiments, sequencing accuracy can be determined using any ISFET based sequencing system.
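
By way of non-limiting illustration, the longest perfect read referred to above can be approximated as the longest run of consecutive correctly called nucleotides in a read. The sketch below assumes, for simplicity, that the read has already been placed against the expected template sequence position by position with no insertions or deletions; practical sequencing pipelines handle alignment and indels, which this sketch does not. The function name and toy sequences are hypothetical.

```python
def longest_perfect_run(read, expected):
    """Length of the longest stretch of consecutive matching bases.

    Assumes `read` and `expected` are compared position by position
    (no indels); comparison stops at the end of the shorter string.
    """
    longest = 0
    current = 0
    for called, true_base in zip(read, expected):
        if called == true_base:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # an error ends the current perfect stretch
    return longest


# Toy 10-base example (an assumption, not data from the disclosure):
# the read has a single miscall at the fifth base.
toy_expected = "ACGTAGCAAC"
toy_read = "ACGTTGCAAC"
print(longest_perfect_run(toy_read, toy_expected))
```

Comparing this value for a modified polymerase and for the reference polymerase is meaningful only when both reads are generated under identical conditions, as stated in the text.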

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 and further comprises at least one amino acid substitution selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 and L828, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of a fragment of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 that retains polymerase activity. In some embodiments, the polymerase activity, also referred to herein as polymerase properties or polymerase characteristics, is selected from primer extension activity, strand displacement activity, proofreading activity, nick-initiated polymerase activity, reverse transcriptase activity, accuracy, average read length, thermostability, processivity, strand bias or nucleotide polymerization activity. In some embodiments, the polymerase activity is selected from one or more sequencing based metrics selected from raw read accuracy, average read length, thermostability or processivity.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of a biologically active fragment of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 having polymerase activity selected from improved read length, improved accuracy or improved thermostability as compared to polymerase activity of SEQ ID NO: 1 or SEQ ID NO: 34 under identical conditions. In some embodiments, the polymerase activity is determined in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution is at least 120 mM KCl. In some embodiments, the high ionic strength solution is from 125 mM KCl to 200 mM KCl.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 80% identity to SEQ ID NO: 1 and further comprising at least one amino acid substitution selected from the group consisting of A97, K240, L287 and K292, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 80% identity to SEQ ID NO: 1 and further comprising at least one amino acid substitution selected from the group consisting of A97V, K240I, L287T and K292C, wherein the numbering is relative to SEQ ID NO: 1.
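The substitution notation used in these embodiments can be read as follows: a label such as A97V denotes the residue A at position 97 of the reference sequence replaced by V, while a label such as A97 names the position without specifying the replacement. The following Python sketch illustrates that reading. The function names and the four-residue toy sequence are hypothetical and are not taken from the Sequence Listing; positions are assumed to be 1-based relative to the reference.

```python
import re


def parse_substitution(label):
    """Split a label such as 'A97V' into (original, position, replacement).

    A label that names only a position (e.g. 'A97') yields replacement None.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z]?)", label)
    if match is None:
        raise ValueError(f"unrecognized substitution label: {label}")
    original = match.group(1)
    position = int(match.group(2))
    replacement = match.group(3) or None
    return original, position, replacement


def apply_substitution(reference, label):
    """Return a copy of `reference` carrying the substitution named by `label`."""
    original, position, replacement = parse_substitution(label)
    if replacement is None:
        raise ValueError("label does not name a replacement residue")
    if not 1 <= position <= len(reference):
        raise ValueError("position lies outside the reference sequence")
    if reference[position - 1] != original:
        raise ValueError("reference residue does not match the label")
    # Positions are 1-based, so index position - 1 is the residue replaced.
    return reference[:position - 1] + replacement + reference[position:]


toy_reference = "MKAL"  # hypothetical sequence, not SEQ ID NO: 1
print(parse_substitution("A97V"))
print(apply_substitution(toy_reference, "A3V"))
```

Checking the reference residue before substituting guards against applying a label whose numbering is relative to a different reference sequence.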

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E397 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E397V amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a L763 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a L763F amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E805 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E805I amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E745 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E745T amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E397 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E397V amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a L763 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a L763F amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E805 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E805I amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E745 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E745T amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure relates generally to a composition comprising a recombinant polymerase homologous to SEQ ID NO: 1 or a biologically active fragment thereof having at least 90% identity to SEQ ID NO: 1, wherein the recombinant polymerase comprises a mutation or combination of mutations relative to SEQ ID NO: 1 selected from P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I or L828A. In some embodiments, the recombinant polymerase homologous to SEQ ID NO: 1 or a biologically active fragment thereof includes a thermostable DNA polymerase from a species other than Thermus aquaticus (Taq).

In some embodiments, the disclosure relates generally to a composition comprising a recombinant polymerase homologous to SEQ ID NO: 34 or a biologically active fragment thereof having at least 90% identity to SEQ ID NO: 34, wherein the recombinant polymerase comprises a mutation or combination of mutations relative to SEQ ID NO: 34 selected from P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I or L828A. In some embodiments, the recombinant polymerase homologous to SEQ ID NO: 34 or a biologically active fragment thereof includes a thermostable DNA polymerase from a species other than Thermus aquaticus (Taq). In some embodiments, the recombinant polymerase homologous to SEQ ID NO: 1 or SEQ ID NO: 34 includes a thermostable polymerase selected from the group consisting of Klentaq-235 DNA polymerase, Klentaq-278 DNA polymerase, Stoffel fragment, Klentaq-291 DNA polymerase, Pyrococcus furiosus DNA polymerase, Pyrococcus GB-D DNA polymerase, Thermus flavus DNA polymerase, Thermus thermophilus DNA polymerase, Thermococcus litoralis DNA polymerase and a combination thereof.

In some embodiments, the disclosure relates generally to a composition comprising a recombinant polymerase homologous to SEQ ID NO: 34 or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 34 or a biologically active fragment thereof and where the recombinant polymerase comprises an E397 mutation. In some embodiments, the recombinant polymerase homologous to SEQ ID NO: 34 comprises a mutation that increases processivity, increases accuracy, increases average read length or improves thermostability, as compared to a reference polymerase lacking the corresponding mutation. In some embodiments, the increased processivity, increased accuracy, increased average read length, or improved thermostability is measured using an ISFET. In some embodiments, the ISFET is coupled to a semiconductor based sequencing platform. In some embodiments, the semiconductor based sequencing platform is a Personal Genome Machine or a Proton Sequencer (Life Technologies Corp., CA).

In some embodiments, the disclosure relates generally to a composition comprising a recombinant polymerase homologous to SEQ ID NO: 34 or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 34 or a biologically active fragment thereof and where the recombinant polymerase comprises a mutation or combination of mutations relative to SEQ ID NO: 34 selected from E397V, and where the polymerase further includes a mutation at one or more of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure relates generally to a composition comprising a recombinant polymerase homologous to SEQ ID NO: 34 or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 34 or a biologically active fragment thereof and where the recombinant polymerase comprises a mutation or combination of mutations relative to SEQ ID NO: 34 selected from E397V, L763F, E805I and E745T, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure relates generally to a composition comprising a recombinant polymerase homologous to SEQ ID NO: 34 or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 34 or a biologically active fragment thereof and where the recombinant polymerase comprises a mutation or combination of mutations relative to SEQ ID NO: 34 selected from E397V, L763F, E805I and E745T, and where the polymerase further includes a mutation at one or more of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, E713W, V737A, E790G, E794C, and L828A, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the recombinant polymerase homologous to SEQ ID NO: 1 or SEQ ID NO: 34 or the biologically active fragment thereof comprises increased accuracy as compared to a reference polymerase lacking a mutation or combination of mutations relative to SEQ ID NO: 1 or SEQ ID NO: 34; or increased read length as compared to a reference polymerase lacking a mutation or combination of mutations relative to SEQ ID NO: 1 or SEQ ID NO: 34; or increased total sequencing throughput as compared to a reference polymerase lacking a mutation or combination of mutations relative to the recombinant polymerase homologous to SEQ ID NO: 1 or SEQ ID NO: 34; or reduced strand bias as compared to a reference polymerase lacking a mutation or combination of mutations relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the increased accuracy, increased read length, increased sequencing throughput or reduced strand bias is measured using an ISFET. In some embodiments, the ISFET is coupled to a semiconductor based sequencing platform. In some embodiments, the semiconductor based sequencing platform is a Personal Genome Machine or a Proton Sequencer available from Life Technologies Corp. (CA).

In some embodiments, the disclosure relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, where the polymerase or biologically active fragment thereof improves sequencing coverage of a GC rich genome, wherein the GC rich genome is at least 60%, 65%, 70%, 75%, 80%, 85% or more GC rich. In some embodiments, the GC rich genome is derived or obtained from a GC rich organism, e.g., bacterial genomes such as Rhodococcus and the like. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of a GC rich genome such that upon nucleic acid sequencing the data includes less than 100 nucleic acid gaps per gigabyte of nucleic acid sequencing data. In some embodiments, the polymerase or biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34 further includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 are selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 or L828, wherein the numbering is relative to SEQ ID NO: 1. It will be apparent to one of ordinary skill in the art that any appropriate method to determine GC content is considered sufficient. For example, GC content can be measured by determining the melting temperature of the DNA double helix using spectrophotometry. The absorbance of DNA at 260 nm increases significantly when double-stranded DNA separates to form two single-strands. Other suitable methods to determine GC content include calculating the expected melting temperature using a single GC calculator or using flow cytometry to determine GC ratios when a large number of samples is analyzed.
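Where the nucleotide sequence itself is available, GC content can also be computed directly by counting bases. This is a different approach from the melting-temperature and flow-cytometry methods named above and is offered only as an illustrative Python sketch; the function names, the 60% default threshold argument and the toy sequences are assumptions made for the example.

```python
def gc_fraction(sequence):
    """Return the fraction of G and C bases among the A/C/G/T bases of a sequence.

    Ambiguous or unknown symbols (e.g. 'N') are ignored rather than counted.
    """
    bases = [base for base in sequence.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


def is_gc_rich(sequence, threshold=0.60):
    """Report whether a sequence meets a GC-rich threshold (60% assumed here)."""
    return gc_fraction(sequence) >= threshold


# Hypothetical toy sequences for illustration.
print(gc_fraction("GGCCAT"))
print(is_gc_rich("GGCCAT"))
print(is_gc_rich("ATATGC"))
```

The same functions apply to AT rich sequences, since the AT fraction is one minus the GC fraction when only A, C, G and T are counted.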

In some embodiments, the disclosure relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, where the polymerase or biologically active fragment thereof improves sequencing coverage of a GC rich genome, wherein the GC rich genome is at least 60%, 65%, 70%, 75%, 80%, 85% or more GC rich. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of a GC rich genome such that upon nucleic acid sequencing the data includes less than 50 nucleic acid gaps per gigabyte of nucleic acid sequencing data. In some embodiments, the polymerase or biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34 further includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 are selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 or L828, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, where the polymerase or biologically active fragment thereof improves sequencing coverage of a GC rich genome, wherein the GC rich genome is at least 60%, 65%, 70%, 75%, 80%, 85% or more GC rich. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of a GC rich genome such that upon nucleic acid sequencing the data includes less than 20 nucleic acid gaps per gigabyte of nucleic acid sequencing data. In some embodiments, the polymerase or biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34 further includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 are selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 or L828, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, where the polymerase or biologically active fragment thereof improves sequencing coverage of an AT rich genome, wherein the AT rich genome is at least 60%, 65%, 70%, 75%, 80% or more AT rich. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of an AT rich genome such that upon sequencing the data includes less than 100 nucleic acid gaps per gigabyte of nucleic acid sequencing data. In some embodiments, the polymerase or biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34 further includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 are selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 or L828, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, where the polymerase or biologically active fragment thereof improves sequencing coverage of an AT rich genome, wherein the AT rich genome is at least 60%, 70%, 75% or 80% AT rich. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of an AT rich genome such that upon sequencing the data includes less than 50 nucleic acid gaps per gigabyte of nucleic acid sequencing data.

In some embodiments, the disclosure relates generally to a composition comprising a polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 1 or SEQ ID NO: 34, where the polymerase or biologically active fragment thereof improves sequencing coverage of an AT rich genome, wherein the AT rich genome is at least 60%, 70%, 75% or 80% AT rich. In some embodiments, the polymerase or biologically active fragment thereof improves sequencing of an AT rich genome such that upon sequencing the data includes less than 20 nucleic acid gaps per gigabyte of nucleic acid sequencing data.

In some embodiments, the disclosure relates generally to a method for performing nucleic acid amplification comprising or consisting of contacting a modified polymerase with a nucleic acid template in the presence of one or more nucleotides, where the modified polymerase includes one or more amino acid substitutions relative to SEQ ID NO: 1 or SEQ ID NO: 34 and has an increased accuracy relative to SEQ ID NO: 1 or SEQ ID NO: 34, and polymerizing at least one of the one or more nucleotides using the modified polymerase. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure relates generally to a method for obtaining sequence information from a nucleic acid template comprising providing a reaction mixture including a template nucleic acid hybridized to a sequencing primer and bound to a modified polymerase; contacting the template nucleic acid with at least one type of nucleotide triphosphate, wherein the contacting includes incorporating one or more nucleotides from the at least one type of nucleotide onto the 3′ end of the sequencing primer and generating an extended primer product; detecting the presence of the extended primer product in the reaction mixture, thereby determining whether nucleotide incorporation has occurred; and identifying at least one of the one or more nucleotides incorporated from the at least one type of nucleotide. In some embodiments, the method includes a modified polymerase comprising an isolated polypeptide having at least 80% identity to SEQ ID NO: 1 and/or SEQ ID NO: 34, wherein the modified polymerase includes one or more amino acid substitutions selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 or L828, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the method can include a modified polymerase comprising an isolated polypeptide having at least 80% identity to SEQ ID NO: 1 and/or SEQ ID NO: 34, wherein the modified polymerase includes one or more amino acid substitutions selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I or L828A, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the method can include contacting, detecting and identifying steps that are repeated more than once, thereby identifying a plurality of sequential nucleotide incorporations. In some embodiments, the method can include incorporating one or more reversible terminators and/or nucleotide analogs. In some embodiments, the method can include incorporating at least one dNTP (such as dATP, dTTP, dGTP or dCTP).

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings, which are incorporated into and form a part of the specification, illustrate one or more exemplary embodiments and serve to explain the principles of various exemplary embodiments. The drawings are exemplary and explanatory only and are not to be construed as limiting or restrictive in any way.

FIGS. 1A-1E show a graph providing exemplary sequencing throughput and mean read length data obtained using exemplary modified polymerases according to the disclosure.

FIGS. 2A1, 2A2, 2B1, 2B2 are a table and a chart providing exemplary nucleic acid sequencing data obtained using exemplary modified polymerases according to the disclosure.

FIGS. 3A-3B are a table providing exemplary nucleic acid sequencing data obtained for modified polymerases obtained according to the disclosure, as compared to a reference polymerase (SEQ ID NO: 34).

FIG. 4 is a table providing exemplary nucleic acid sequencing data with respect to GC content obtained using an exemplary modified polymerase according to the disclosure (SEQ ID NO: 2).

FIGS. 5A-5B are a table and a chart providing exemplary nucleic acid sequencing data obtained using exemplary modified polymerases according to the disclosure, as compared to a reference polymerase (SEQ ID NO: 34).

FIG. 6 is a table providing exemplary nucleic acid sequencing data obtained using exemplary modified polymerases according to the disclosure, as compared to a reference polymerase (SEQ ID NO: 34).

FIG. 7 shows a graph providing exemplary thermostability data obtained using an exemplary modified polymerase according to the disclosure.

FIG. 8 shows a graph providing exemplary thermostability data obtained using an exemplary modified polymerase according to the disclosure.

FIG. 9 shows a graph providing exemplary thermostability data obtained using an exemplary modified polymerase according to the disclosure.

FIG. 10 shows a graph providing exemplary thermostability data obtained using an exemplary modified polymerase according to the disclosure.

FIG. 11 shows a graph providing exemplary thermostability data at 95° C. obtained using exemplary modified polymerases according to the disclosure.

FIG. 12 shows a graph providing exemplary thermostability data obtained at 96° C. using exemplary modified polymerases according to the disclosure.

FIG. 13 shows a graph providing exemplary thermostability data obtained at 95° C. in the absence of trehalose using exemplary modified polymerases according to the disclosure.

FIG. 14 is a schematic outlining an exemplary thermostability activity assay performed according to the disclosure.

DETAILED DESCRIPTION

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of ordinary skill in the art to which these inventions belong. All patents, patent applications, published applications, treatises and other publications referred to herein, both supra and infra, are incorporated by reference in their entirety. If a definition and/or description is explicitly or implicitly set forth herein that is contrary to or otherwise inconsistent with any definition set forth in the patents, patent applications, published applications, and other publications that are herein incorporated by reference, the definition and/or description set forth herein prevails over the definition that is incorporated by reference.

The practice of the disclosure will employ, unless otherwise indicated, conventional techniques of molecular biology, microbiology and recombinant DNA techniques, which are within the skill of the art. Such techniques are explained fully in the literature. See, for example, Sambrook, J., and Russell, D. W., 2001, Molecular Cloning: A Laboratory Manual, Third Edition; Ausubel, F. M., et al., eds., 2002, Short Protocols In Molecular Biology, Fifth Edition.

Note that not all of the activities described in the general description or the examples are required, that a portion of a specific activity may not be required, and that one or more further activities may be performed in addition to those described. Still further, the order in which activities are listed is not necessarily the order in which they are performed.

In some instances, some concepts have been described with reference to specific embodiments. However, one of ordinary skill in the art will appreciate that various modifications and changes can be made without departing from the scope of the invention as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the invention.

As used herein, the terms “comprising” (and any form or variant of comprising, such as “comprise” and “comprises”), “having” (and any form or variant of having, such as “have” and “has”), “including” (and any form or variant of including, such as “includes” and “include”), or “containing” (and any form or variant of containing, such as “contains” and “contain”), are inclusive or open-ended and do not exclude additional, unrecited additives, components, integers, elements or method steps. For example, a process, method, article, or apparatus that comprises a list of features is not necessarily limited only to those features but may include other features not expressly listed or inherent to such process, method, article, or apparatus.

Unless expressly stated to the contrary, “or” refers to an inclusive-or and not to an exclusive-or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).

Benefits, other advantages, and solutions to problems have been described with regard to specific embodiments. However, such benefits, advantages, solutions to problems, and any feature(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature of any or all the claims.

After reading the specification, skilled artisans will appreciate that certain features that are, for clarity, described herein in the context of separate embodiments can also be provided in combination in a single embodiment. Conversely, various features that are, for brevity, described in the context of a single embodiment can also be provided separately or in any subcombination. Further, references to values stated in ranges include each and every value within that range.

Also, articles such as “a”, “an” or “the” are employed to describe elements and components described herein. This is done merely for convenience and to give a general sense of the scope of the invention. This description should be read to include one or at least one and the singular also includes the plural unless it is obvious that it is meant otherwise. Accordingly, the terms “a,” “an,” and “the” and similar referents used herein are to be construed to cover both the singular and the plural unless their usage in context indicates otherwise. Accordingly, the use of the word “a” or “an” or “the” when used in the claims or specification, including when used in conjunction with the term “comprising”, may mean “one,” but it is also consistent with the meaning of “one or more,” “at least one,” and “one or more than one.”

As used herein, the term “polymerase” and its variants comprise any enzyme that can catalyze the polymerization of nucleotides (including analogs thereof) into a nucleic acid strand. Typically, but not necessarily, such nucleotide polymerization can occur in a template-dependent fashion. Such polymerases can include without limitation naturally occurring polymerases and any subunits and truncations thereof, mutant polymerases, variant polymerases, recombinant, fusion or otherwise engineered polymerases, chemically modified polymerases, synthetic molecules or assemblies, and any analogs, homologs, derivatives or fragments thereof that retain the ability to catalyze such polymerization. Optionally, the polymerase can be a mutant polymerase comprising one or more mutations involving the replacement of one or more amino acids with other amino acids, the insertion or deletion of one or more amino acids from the polymerase, or the linkage of parts of two or more polymerases, including linking two or more parts from different species or families of polymerases. Typically, the polymerase comprises one or more active sites at which nucleotide binding and/or catalysis of nucleotide polymerization can occur. Some exemplary polymerases include without limitation DNA polymerases (such as for example Phi-29 DNA polymerase, Taq polymerase, reverse transcriptases and E. coli DNA polymerase) and RNA polymerases. The term “polymerase” and its variants, as used herein, also refer to fusion proteins comprising at least two portions linked to each other, where the first portion comprises a peptide that can catalyze the polymerization of nucleotides into a nucleic acid strand and is linked to a second portion that comprises a second polypeptide. In some embodiments, the second polypeptide can include a reporter enzyme or a processivity-enhancing domain.

As used herein, the terms “link”, “linked”, “linkage” and variants thereof comprise any type of fusion, bond, adherence or association that is of sufficient stability to withstand use in the particular biological application of interest. Such linkage can comprise, for example, covalent, ionic, hydrogen, dipole-dipole, hydrophilic, hydrophobic, or affinity bonding, bonds or associations involving van der Waals forces, mechanical bonding, and the like. Optionally, such linkage can occur between a combination of different molecules, including but not limited to: between a nanoparticle and a protein; between a protein and a label; between a linker and a functionalized nanoparticle; between a linker and a protein; between a nucleotide and a label; and the like. Some examples of linkages can be found, for example, in Hermanson, G., Bioconjugate Techniques, Second Edition (2008); and Aslam, M., Dent, A., Bioconjugation: Protein Coupling Techniques for the Biomedical Sciences, London: Macmillan (1998).

The terms “modification” or “modified” and their variants, as used herein with reference to a polypeptide or protein, for example a polymerase, comprise any change in the structural, biological and/or chemical properties of the protein. In some embodiments, the modification can include a change in the amino acid sequence of the protein. For example, the modification can optionally include one or more amino acid mutations, including without limitation amino acid additions, deletions and substitutions (including both conservative and non-conservative substitutions).

The term “conservative” and its variants, as used herein with reference to any change in amino acid sequence, refers to an amino acid mutation wherein one or more amino acids are substituted by another amino acid having highly similar properties. For example, one or more amino acids comprising nonpolar or aliphatic side chains (for example, glycine, alanine, valine, leucine, or isoleucine) can be substituted for each other. Similarly, one or more amino acids comprising polar, uncharged side chains (for example, serine, threonine, cysteine, methionine, asparagine or glutamine) can be substituted for each other. Similarly, one or more amino acids comprising aromatic side chains (for example, phenylalanine, tyrosine or tryptophan) can be substituted for each other. Similarly, one or more amino acids comprising positively charged side chains (for example, lysine, arginine or histidine) can be substituted for each other. Similarly, one or more amino acids comprising negatively charged side chains (for example, aspartic acid or glutamic acid) can be substituted for each other. In some embodiments, the modified polymerase or biologically active fragment thereof is a variant that comprises one or more of these conservative amino acid substitutions, or any combination thereof. In some embodiments, conservative substitutions for leucine include: alanine, isoleucine, valine, phenylalanine, tryptophan, methionine, and cysteine. In other embodiments, conservative substitutions for asparagine include: arginine, lysine, aspartate, glutamate, and glutamine.
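The side-chain groupings listed above can be expressed as a simple lookup. The following Python sketch is illustrative only: the names CONSERVATIVE_GROUPS, side_chain_group and is_conservative are hypothetical, and the sketch covers only the five general groups, not the alternative embodiment-specific lists given for leucine and asparagine.

```python
# Side-chain groups as listed in the paragraph above (one-letter codes).
# Residues not named in that paragraph (e.g. proline) belong to no group here.
CONSERVATIVE_GROUPS = {
    "nonpolar_aliphatic": set("GAVLI"),
    "polar_uncharged": set("STCMNQ"),
    "aromatic": set("FYW"),
    "positively_charged": set("KRH"),
    "negatively_charged": set("DE"),
}


def side_chain_group(residue):
    """Return the name of the group containing `residue`, or None."""
    for name, members in CONSERVATIVE_GROUPS.items():
        if residue.upper() in members:
            return name
    return None


def is_conservative(original, replacement):
    """True if the two residues differ and share a side-chain group."""
    group = side_chain_group(original)
    return (
        group is not None
        and original.upper() != replacement.upper()
        and group == side_chain_group(replacement)
    )
```

Under these groupings, a leucine-to-isoleucine change is classified as conservative, whereas a glutamic acid-to-valine change is not.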

Throughout this disclosure, various amino acid mutations, including, for example, amino acid substitutions, are referenced using the amino acid single letter code, and indicating the position of the residue within a reference amino acid sequence. In the case of amino acid substitutions, the identity of the substituent is also indicated using the amino acid single letter code. For example, a reference to the hypothetical amino acid substitution “E397V, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1” indicates an amino acid substitution wherein a valine (V) residue is substituted for the normally occurring glutamic acid (E) residue at amino acid position 397 of the amino acid sequence of SEQ ID NO: 1. Some of the amino acid sequences disclosed herein begin with a methionine residue (“M”), which is typically introduced at the beginning of nucleic acid sequences encoding peptides desired to be expressed in bacterial host cells. However, it is to be understood that the disclosure also encompasses all such amino acid sequences beginning from the second amino acid residue onwards, without the inclusion of the first methionine residue.
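The substitution notation described above can be read mechanically. The following Python sketch is one possible illustration; the function names parse_substitution and apply_substitution are hypothetical, and the short sequence used in the usage note is a toy example, not any sequence of the Sequence Listing.

```python
import re

# Original residue, 1-based position, replacement residue, e.g. "E397V".
_SUBSTITUTION = re.compile(
    r"^([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY])$"
)


def parse_substitution(label):
    """Split a label such as 'E397V' into ('E', 397, 'V')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_substitution(reference, label):
    """Return `reference` with the substitution applied.

    Positions are 1-based and relative to `reference`; the residue
    found at that position must match the one named in the label.
    """
    original, position, replacement = parse_substitution(label)
    if position < 1 or position > len(reference):
        raise ValueError(f"position {position} is outside the reference")
    index = position - 1
    if reference[index] != original:
        raise ValueError(
            f"expected {original} at position {position}, "
            f"found {reference[index]}"
        )
    return reference[:index] + replacement + reference[index + 1:]
```

For instance, applying the label “E3V” to the toy sequence “MKEL” yields “MKVL”. The check on the original residue guards against numbering that is offset by one, as can happen when the leading methionine is omitted.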

As used herein, the terms “identical” or “percent identity,” and their variants, when used in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences (or subsequences such as biologically active fragments) that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using any one or more of the following sequence comparison algorithms: Needleman-Wunsch (see, e.g., Needleman, Saul B.; and Wunsch, Christian D. (1970). “A general method applicable to the search for similarities in the amino acid sequence of two proteins” Journal of Molecular Biology 48 (3):443-53); Smith-Waterman (see, e.g., Smith, Temple F.; and Waterman, Michael S., “Identification of Common Molecular Subsequences” (1981) Journal of Molecular Biology 147:195-197); or BLAST (Basic Local Alignment Search Tool; see, e.g., Altschul S F, Gish W, Miller W, Myers E W, Lipman D J, “Basic local alignment search tool” (1990) J Mol Biol 215 (3):403-410).

As used herein, the terms “identical” or “identity”, and their variants, when used in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences (such as biologically active fragments) that have at least 60%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm or by visual inspection. Substantially identical sequences are typically considered to be “homologous” without reference to actual ancestry.
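Once two sequences have been aligned, percent identity reduces to a count over the aligned columns. The following Python sketch assumes the alignment has already been produced by some other tool and uses “-” as the gap character. The choice of denominator (all aligned columns, gaps included) is an assumption, as the disclosure does not fix one, and the function name percent_identity is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must be rows of the same alignment (equal length).
    Columns in which both rows are gaps are ignored; a column counts
    as identical only when both rows carry the same non-gap symbol.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)
```

Other conventions (for example, dividing by the length of the shorter sequence or by the number of ungapped columns) give different values for the same alignment, so the convention used should be stated alongside any reported percentage.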

Proteins and/or protein subsequences (such as biologically active fragments) are “homologous” when they are derived, naturally or artificially, from a common ancestral protein or protein sequence. Similarly, nucleic acids and/or nucleic acid sequences are homologous when they are derived, naturally or artificially, from a common ancestral nucleic acid or nucleic acid sequence. Homology is generally inferred from sequence similarity between two or more nucleic acids or proteins (or biologically active fragments or sequences thereof). The precise percentage of similarity between sequences that is useful in establishing homology varies with the nucleic acid and protein at issue, but as little as 25% sequence similarity over 25, 50, 100, 150, or more nucleic acids or amino acid residues, is routinely used to establish homology. Higher levels of sequence similarity, e.g., 50%, 60%, 70%, 80%, 85%, 90%, 95%, 98% or 99%, can also be used to establish homology.

Methods for determining sequence similarity percentages (e.g., BLASTP and BLASTN using default parameters) are described herein and are generally available. For sequence comparison and homology determination, typically one sequence acts as a reference sequence to which test sequences are compared. Generally, when using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally Current Protocols in Molecular Biology, Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., supplemented through 2004).

One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity (homology) is the BLAST algorithm, which is described in Altschul et al., J Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information. This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length “W” in the query sequence, which either match or satisfy some positive-valued threshold score “T” when aligned with a word of the same length in a database sequence. “T” is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters “M” (reward score for a pair of matching residues; always >0) and “N” (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters “W”, “T”, and “X” determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915).
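The three halting conditions for word-hit extension can be illustrated with a simplified, ungapped extension in one direction. The following Python sketch is not the BLAST implementation: it uses the reward M=5 and penalty N=−4 stated above for nucleotide sequences, while the function name extend_right and the default drop-off value x_drop=20 are assumptions chosen for illustration.

```python
def extend_right(query, subject, q_start, s_start,
                 match=5, mismatch=-4, x_drop=20):
    """Extend a seed hit to the right without gaps.

    Returns (best_score, best_length), where best_length is the number
    of aligned positions giving the maximum cumulative score.
    Extension stops when the score falls x_drop below its maximum,
    when the score reaches zero or below, or when either sequence ends.
    """
    score = 0
    best_score = 0
    best_length = 0
    offset = 0
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += match if same else mismatch
        offset += 1
        if score > best_score:
            # New maximum: remember how far the extension has reached.
            best_score = score
            best_length = offset
        elif best_score - score >= x_drop or score <= 0:
            break
    return best_score, best_length
```

With x_drop=8, extending “ACGTTTTT” against “ACGTAAAA” from position 0 stops after the second mismatch and reports a best score of 20 over 4 positions. A full extension would repeat the same procedure to the left of the seed and sum the two scores.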

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, less than about 0.01, or less than about 0.001.

The term “primer extension activity” and its variants, as used herein, when used in reference to a given polymerase, comprise any in vivo or in vitro enzymatic activity characteristic of a given polymerase that relates to catalyzing nucleotide incorporation onto the terminal 3′-OH end of an extending nucleic acid molecule. Typically, but not necessarily, such nucleotide incorporation occurs in a template-dependent fashion. In some embodiments, the primer extension activity of a given polymerase can be quantified as the total number of nucleotides incorporated (as measured by, e.g., radiometric or other suitable assay) by a unit amount of polymerase (in moles) per unit time (seconds) under a particular set of reaction conditions.

The term “thermostability” and its variants, as used herein, when used in reference to a given polymerase, comprise any in vivo or in vitro enzymatic activity characteristic of a given polymerase that relates to catalyzing nucleotide incorporation at moderately high temperature without the loss of properties that relate to catalyzing nucleotide incorporation. Typically, but not necessarily, such nucleotide incorporation occurs in a template-dependent fashion. In some embodiments, the thermostability of a given polymerase can be quantified as the total number of nucleotides incorporated (as measured by, e.g., radiometric or other suitable assay) by a unit amount of polymerase (in moles) per unit time (minute) at a given temperature (° C. or ° F.). In some embodiments, the thermostability of a given polymerase can be quantified by measuring polymerization activity by a unit amount of polymerase (in moles) after incubation at 95° C. for 40 minutes. In one embodiment, the thermostability of a given polymerase can be quantified by measuring polymerization activity based on the half-life of the polymerase. For example, Taq has a half-life of greater than 2 hours at 92.5° C., 40 minutes at 95° C., and 9 minutes at 97.5° C. (Lawyer et al., (1993) PCR Methods Appl., 2 (4) 275-87). Some of the examples described herein compare the relative amounts of nucleotide polymerization of a reference polymerase to a modified polymerase (e.g., nucleotide polymerization using SEQ ID NO: 1 as compared to nucleotide polymerization using SEQ ID NO: 2). In these examples, the nucleotide polymerization properties of the reference polymerase and the modified polymerase (or biologically active fragment thereof) are assessed under identical conditions that include elevated temperatures, such as 95° C., 96° C., or 97° C., for various times such as 2 minutes, 4 minutes, 6 minutes or 8 minutes (see, e.g., Example 10, FIGS. 11-14) before performing a PCR reaction using the polymerase.
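A half-life can be related to the activity remaining after a heat incubation if thermal inactivation is assumed to follow first-order (single-exponential) kinetics. That kinetic model is an assumption made here for illustration and is not stated in the disclosure; the function names in the following Python sketch are likewise hypothetical.

```python
import math


def residual_activity(half_life_min, incubation_min):
    """Fraction of activity remaining after incubation.

    Assumes first-order inactivation, so the activity halves once
    per half-life at the incubation temperature.
    """
    if half_life_min <= 0 or incubation_min < 0:
        raise ValueError("half-life must be > 0 and incubation >= 0")
    return 0.5 ** (incubation_min / half_life_min)


def half_life_from_residual(fraction_remaining, incubation_min):
    """Half-life (minutes) implied by a measured residual fraction."""
    if not 0 < fraction_remaining < 1 or incubation_min <= 0:
        raise ValueError("need 0 < fraction < 1 and incubation > 0")
    return incubation_min * math.log(2) / (-math.log(fraction_remaining))
```

Under this model, a polymerase with the 40 minute half-life cited above for Taq at 95° C. would retain half of its activity after a 40 minute incubation at that temperature and one quarter after 80 minutes. Comparing the residual fractions of a reference polymerase and a modified polymerase after identical incubations gives a relative measure of thermostability.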

Thermostable polymerases generally have optimal activity at about 70° C. (for Thermus aquaticus (Taq), the optimum is 74° C., and Taq demonstrates insertion of approximately 2800 nucleotides/min at 70° C., 1400 nucleotides/min at 55° C., 90 nucleotides/min at 37° C. and about 15 nucleotides/min at 22° C.). Polymerases from Pyrococcus furiosus (Pfu), Pyrococcus woesei (Pwo), Thermotoga maritima (Tma) and Thermococcus litoralis (Tli or Vent) are also encompassed within the scope of the present disclosure. These polymerases demonstrate substantially higher temperature stability than Thermus aquaticus (Taq).

The term “accuracy” and its variants, as used herein (such as “raw read accuracy”) when used in reference to a given polymerase, comprises the longest perfect read (typically measured in terms of the number of nucleotides correctly included in the read) obtained from a nucleotide polymerization reaction. Accordingly, average read accuracy, as used herein, when referring to a given polymerase refers to the “average” perfect read obtained from a nucleotide polymerization reaction.
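The longest perfect read can be computed as the longest run of consecutive positions at which a read agrees with its reference. The following Python sketch assumes each read has already been aligned to the reference and is supplied in the same coordinates, with “-” marking gaps; the function names are hypothetical.

```python
def longest_perfect_read(read, reference, gap="-"):
    """Length of the longest error-free stretch of an aligned read.

    `read` and `reference` are compared position by position; any
    mismatch or gap ends the current error-free stretch.
    """
    longest = 0
    current = 0
    for observed, expected in zip(read, reference):
        if observed == expected and observed != gap:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


def average_perfect_read(reads, reference):
    """Mean of the longest perfect read over a collection of reads."""
    if not reads:
        raise ValueError("at least one read is required")
    lengths = [longest_perfect_read(read, reference) for read in reads]
    return sum(lengths) / len(lengths)
```

For example, the read “ACGTTACG” compared with the reference “ACGTAACG” has one mismatch at the fifth position, giving a longest perfect read of 4 nucleotides.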

The term “DNA binding activity” and its variants, as used herein, when used in reference to a given polymerase, comprise any in vivo or in vitro enzymatic activity characteristic of a given polymerase that relates to interaction of the polymerase with a DNA sequence in a recognition-based manner. Typically, but not necessarily, such interaction includes binding of the polymerase, and more specifically binding of the DNA-binding domain of the polymerase, to the recognized DNA sequence. In some embodiments, recognition includes binding of the polymerase to a sequence-specific or non-sequence specific DNA sequence. In some embodiments, the DNA binding activity of a given polymerase can be quantified as the affinity of the polymerase to recognize and bind to the recognized DNA sequence. For example, DNA binding activity can be monitored and determined using an anisotropy signal change (or other suitable assay) as a protein-DNA complex is formed under a particular set of reaction conditions.

As used herein, the term “biologically active fragment” and its variants, when used in reference to a given biomolecule, refers to any fragment, derivative, homolog or analog of the biomolecule that possesses an in vivo or in vitro activity that is characteristic of the biomolecule itself. For example, a polymerase can be characterized by various biological activities, for example DNA binding activity, nucleotide polymerization activity, primer extension activity, strand displacement activity, reverse transcriptase activity, nick-initiated polymerase activity, 3′-5′ exonuclease (proofreading) activity, thermostability, accuracy, processivity, and the like. In some embodiments, a “biologically active fragment” of a polymerase is any fragment, derivative, homolog or analog of the polymerase that can catalyze the polymerization of nucleotides (including homologs and analogs thereof) into a nucleic acid strand. In some embodiments, the biologically active fragment, derivative, homolog or analog of the polymerase possesses 10%, 20%, 30%, 40%, 50%, 60%, 70%, 75%, 80%, 85%, 90%, 95%, or 98% or greater of the biological activity of the polymerase in any in vivo or in vitro assay of interest such as, for example, DNA binding assays, nucleotide polymerization assays (which may be template-dependent or template-independent), primer extension assays, strand displacement assays, reverse transcriptase assays, proofreading assays, accuracy assays, thermostability assays and the like.

In some embodiments, the biological activity of a polymerase fragment is assayed by measuring, in vitro and under defined reaction conditions, any one of the following properties of the fragment: primer extension activity, polymerization activity, thermostability, accuracy, processivity, strand displacement activity, read-length activity, strand bias activity, or proofreading activity. In some embodiments, the biological activity of a polymerase fragment is assayed by measuring the output of an in vitro assay such as sequencing throughput or average read length as performed by the polymerase fragment under defined reaction conditions. In some embodiments, the biological activity of a polymerase fragment is assayed by measuring the output of a nucleotide polymerization reaction in vitro, such as the raw accuracy with which the polymerase fragment incorporates correct Watson-Crick nucleotides in the nucleotide polymerization reaction under defined reaction conditions. In some embodiments, assaying the biologically active fragment of a polymerase can include measuring any one or more of the polymerase biological activities outlined herein.

In some embodiments, a biologically active fragment can include any part of the DNA binding domain or any part of the catalytic domain of the modified polymerase. In some embodiments, the biologically active fragment can optionally include any 25, 50, 75, 100, 150 or more contiguous amino acid residues of the DNA binding or catalytic domain. In some embodiments, a biologically active fragment of the modified polymerase can include at least 25 contiguous amino acid residues of the catalytic domain or the DNA binding domain having at least 80%, 85%, 90%, 95%, 98%, or 99% identity to any one or more of the polymerases encompassed by the disclosure. In some embodiments, a biologically active fragment of a modified polymerase can include at least 25 contiguous amino acid residues of the catalytic domain or the DNA binding domain having at least 80%, 85%, 90%, 95%, 98%, or 99% identity to any one or more of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

Biologically active fragments can optionally exist in vivo, such as, for example, fragments which arise from post-transcriptional processing or which arise from translation of alternatively spliced RNAs, or alternatively can be created through engineering, bulk synthesis, or other suitable manipulation. Biologically active fragments include fragments expressed in native or endogenous cells as well as those made in expression systems such as, for example, in bacterial, yeast, insect or mammalian cells.

In some embodiments, the disclosure relates generally to not only the specific polymerases disclosed herein, but also to any biologically active fragment of such polymerases, which are encompassed within the scope of the present disclosure. In some embodiments, a biologically active fragment of any polymerase of the disclosure includes any fragment that exhibits primer extension activity in vitro. In some embodiments, a biologically active fragment of any polymerase of the disclosure includes any fragment that exhibits DNA binding activity in vitro. In some embodiments, a biologically active fragment of any polymerase of the disclosure includes any fragment that retains polymerase activity in vitro. Polymerase activity can be determined by any method known in the art. For example, determination of polymerase activity can be based on the activity of extending a primer on a template.

In some embodiments, the disclosure generally relates to a modified polymerase having one or more amino acid mutations (such as a deletion, substitution or addition) relative to a reference polymerase lacking the one or more amino acid mutations, and wherein the modified polymerase retains polymerase activity in vitro or exhibits primer extension activity in vitro. In some embodiments, the modified polymerase includes any biologically active fragment of such polymerase that retains processivity in vitro or exhibits thermostability activity in vitro.

In some embodiments, the disclosure generally relates to a modified polymerase having one or more amino acid mutations (such as a deletion, substitution or addition) relative to a reference polymerase lacking the one or more amino acid mutations, and wherein the modified polymerase retains proofreading activity in vitro. Whether a polymerase exhibits exonuclease activity, or exhibits reduced exonuclease activity, can be readily determined by standard methods. For example, polynucleotides can be synthesized such that a detectable proportion of the nucleotides are radioactively labeled. These polynucleotides can be incubated in an appropriate buffer in the presence of the polypeptide to be tested. After incubation, the polynucleotide is precipitated and exonuclease activity is detectable as radioactive counts due to free nucleotides in the supernatant. As will be appreciated by the skilled artisan, an appropriate polymerase or biologically active fragment may be selected from those described herein based on any of the above biological activities, or combinations thereof, depending on the application of interest.

As used herein, the term “nucleotide” and its variants comprise any compound that can bind selectively to, or can be polymerized by, a polymerase. Typically, but not necessarily, selective binding of the nucleotide to the polymerase is followed by polymerization of the nucleotide into a nucleic acid strand by the polymerase; occasionally, however, the nucleotide may dissociate from the polymerase without becoming incorporated into the nucleic acid strand, an event referred to herein as a “non-productive” event. Such nucleotides include not only naturally-occurring nucleotides but also any analogs, regardless of their structure, that can bind selectively to, or can be polymerized by, a polymerase. While naturally-occurring nucleotides typically comprise base, sugar and phosphate moieties, the nucleotides of the disclosure can include compounds lacking any one, some or all of such moieties. In some embodiments, the nucleotide can optionally include a chain of phosphorus atoms comprising three, four, five, six, seven, eight, nine, ten or more phosphorus atoms. In some embodiments, the phosphorus chain can be attached to any carbon of a sugar ring, such as the 5′ carbon. The phosphorus chain can be linked to the sugar with an intervening O or S. In one embodiment, one or more phosphorus atoms in the chain can be part of a phosphate group having P and O. In another embodiment, the phosphorus atoms in the chain can be linked together with intervening O, NH, S, methylene, substituted methylene, ethylene, substituted ethylene, CNH₂, C(O), C(CH₂), CH₂CH₂, or C(OH)CH₂R (where R can be a 4-pyridine or 1-imidazole). In one embodiment, the phosphorus atoms in the chain can have side groups having O, BH₃, or S. In the phosphorus chain, a phosphorus atom with a side group other than O can be a substituted phosphate group. Some examples of nucleotide analogs are described in Xu, U.S. Pat. No. 7,405,281. In some embodiments, the nucleotide comprises a label (e.g., reporter moiety) and is referred to herein as a “labeled nucleotide”; the label of the labeled nucleotide is referred to herein as a “nucleotide label”. In some embodiments, the label can be in the form of a fluorescent dye attached to the terminal phosphate group, i.e., the phosphate group or substituted phosphate group most distal from the sugar. Some examples of nucleotides that can be used in the disclosed methods and compositions include, but are not limited to, ribonucleotides, deoxyribonucleotides, modified ribonucleotides, modified deoxyribonucleotides, ribonucleotide polyphosphates, deoxyribonucleotide polyphosphates, modified ribonucleotide polyphosphates, modified deoxyribonucleotide polyphosphates, peptide nucleotides, metallonucleosides, phosphonate nucleosides, and modified phosphate-sugar backbone nucleotides, analogs, derivatives, or variants of the foregoing compounds, and the like. In some embodiments, the nucleotide can comprise non-oxygen moieties such as, for example, thio- or borano-moieties, in place of the oxygen moiety bridging the alpha phosphate and the sugar of the nucleotide, or the alpha and beta phosphates of the nucleotide, or the beta and gamma phosphates of the nucleotide, or between any other two phosphates of the nucleotide, or any combination thereof.

As used herein, the term “nucleotide incorporation” and its variants comprise polymerization of one or more nucleotides to form a nucleic acid strand including at least two nucleotides linked to each other, typically but not necessarily via phosphodiester bonds, although alternative linkages may be possible in the context of particular nucleotide analogs.

As used herein, the term “processivity” and its variants comprise the ability of a polymerase to remain bound to a single primer/template hybrid. The term processivity as used herein, when used in reference to a given polymerase, comprises the number of nucleotides that a polymerase attaches to the 3′ end of a nucleic acid (e.g., the 3′-OH group of a DNA strand) in a single cycle. This number reflects both the rate of polymerization and the dissociation constant (K_(d)) of the polymerase. In some embodiments, processivity can be measured by the number of nucleotides that a polymerase incorporates into a nucleic acid (such as a sequencing primer) prior to dissociation of the polymerase from the primer/template hybrid. In some embodiments, the polymerase has a processivity of at least 100 nucleotides, although in other embodiments it has a processivity of at least 200 nucleotides, at least 300 nucleotides, at least 400 nucleotides, at least 500 nucleotides or greater. It will be understood by those of ordinary skill in the art that the higher the processivity of the polymerase, the more nucleotides can be incorporated prior to dissociation and therefore the longer the sequence (read-length) that can be obtained. In other words, polymerases having low processivity will typically provide shorter average read-lengths than will polymerases having higher processivity. In one embodiment, polymerases of the instant disclosure containing one or more amino acid mutations can possess improved processivity as compared to a polymerase lacking the one or more amino acid mutations.
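Where the number of nucleotides incorporated per binding event has been measured for a set of extension products, an average processivity can be summarized as in the following Python sketch. The function names are hypothetical. The per-step dissociation probability is derived under a simple geometric model (a constant chance of dissociation after every incorporation), which is an assumption introduced here for illustration rather than a model stated in the disclosure.

```python
def mean_processivity(extension_lengths):
    """Average number of nucleotides incorporated per binding event."""
    if not extension_lengths:
        raise ValueError("at least one extension length is required")
    return sum(extension_lengths) / len(extension_lengths)


def dissociation_probability(extension_lengths):
    """Per-incorporation dissociation probability (geometric model).

    If the polymerase dissociates with constant probability p after
    each step, the mean number of incorporations is (1 - p) / p,
    so p = 1 / (1 + mean).
    """
    return 1.0 / (1.0 + mean_processivity(extension_lengths))


def fraction_at_least(extension_lengths, threshold):
    """Fraction of binding events reaching `threshold` nucleotides."""
    if not extension_lengths:
        raise ValueError("at least one extension length is required")
    reached = sum(1 for n in extension_lengths if n >= threshold)
    return reached / len(extension_lengths)
```

The same functions can be applied separately to measurements from a reference polymerase and from a modified polymerase obtained under identical reaction conditions, and the resulting values compared.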

In one exemplary assay, the processivity of a given polymerase can be measured by incubating the polymerase with a primer:template duplex under nucleotide incorporation conditions, and resolving the resulting primer extension products using any suitable method, for example via gel electrophoresis. The primer can optionally include a label to enhance detectability of the primer extension products. The nucleotide incorporation reaction mixture typically includes a vast excess of unlabeled competitor template, thereby ensuring that virtually all of the extension products are produced through a single template binding event. Following such resolution, the average amount of full-length extension products can be quantified using any suitable means, including fluorimetric or radiometric detection of full-length extension products. To compare the processivity of two or more different enzymes (e.g., reference and modified polymerases), each enzyme can be employed in a parallel and separate reaction, following which the resulting full-length primer extension products can be resolved and measured, and such measurements compared.

In other exemplary embodiments, the processivity of a given polymerase can be measured using any suitable assay known in the art, including but not limited to the assays described in Von Hippel, P. H., Fairfield, F. R. and Dolejsi, M. K., On the processivity of polymerases, Ann. NY Acad. Sci., 726:118-131 (1994); Bambara, R. A., Uyemura, D. and Choi, T., On the processive mechanism of Escherichia coli DNA polymerase I. Quantitative assessment of processivity, J. Biol. Chem., 253:413-423 (1978); Das, S. K. and Fujimura, R. K., Processiveness of DNA polymerases. A comparative study using a simple procedure, J. Biol. Chem., 254:1227-1232 (1979); Nasir, M. S. and Jolley, M. E., Fluorescence polarization: An Analytical Tool for Immunoassay and Drug Discovery, Combinatorial Chemistry and High Throughput Screening, 2:177-190 (1999); Mestas, S. P., Sholders, A. J., and Peersen, O. B., A Fluorescence Polarization Based Screening Assay for Nucleic Acid Polymerase Elongation Activity, Anal. Biochem., 365:194-200 (2007); Nikiforov, T. T., Fluorogenic polymerase, endonuclease, and ligase assays based on DNA substrates labeled with a single fluorophore, Analytical Biochemistry 412:229-236; and Yan Wang, Dennis E. Prosen, Li Mei, John C. Sullivan, Michael Finney and Peter B. Vander Horn, Nucleic Acids Research, 32(3):1197-1207 (2004).

The terms “read length” or “read-length” and their variants, as used herein, refer to the number of nucleotides that are polymerized (or incorporated into an existing nucleic acid strand) in a template-dependent manner by a polymerase prior to dissociation from a template nucleic acid strand. In some embodiments, a polymerase that dissociates from the template nucleic acid strand after five incorporations will typically provide a sequence having a read length of 5 nucleotides, while a polymerase that dissociates from the template nucleic acid strand after 500 nucleotide incorporations will typically provide a sequence having a read length of about 500 nucleotides. While the actual or absolute processivity of a given polymerase (or the actual read length of polymerization products produced by the polymerase) can vary from reaction to reaction (or even within a single reaction mixture wherein the polymerase produces different products having different read lengths), the polymerase can be characterized by the average processivity (or average read length of polymerization products) observed under a defined set of reaction conditions. The “error-free read length” comprises the number of nucleotides that are consecutively and contiguously incorporated without error (i.e., without mismatch and/or deviation from an established and predictable set of base pairing rules) into the newly synthesized nucleic acid strand.

The terms “systematic error” or “SE” and their variants, as used herein, refer to the percentage of errors present in a sequence motif containing a homopolymer of a defined length, with systematic deletion occurring on the nucleic acid strand at a specified minimum frequency, and with sequencing coverage occurring at a specified minimum frequency. For example, in some embodiments the systematic error can be measured as the percentage of errors in sequence motifs containing homopolymers of length 1-6, with systematic deletion occurring on strand with a frequency greater than 15%, when coverage (of the sequencing run) is equal to or greater than 20×. In some embodiments, the systematic error is estimated as the percentage of stochastic errors in sequence motifs containing homopolymers of length 1-6, with systematic deletion occurring on strand with a frequency greater than 15%, when coverage (of the sequencing run) is equal to or greater than 20×; such embodiments are the focus of several of the working examples disclosed herein. In some embodiments, the percentage of systematic error is lowered when using a modified polymerase as disclosed herein as compared to a reference polymerase (e.g., a wild-type Taq polymerase) that does not contain one or more amino acid modifications. While the actual systematic error of a given polymerase can vary from reaction to reaction (or even within a single reaction mixture), the polymerase can be characterized by the percentage systematic error observed under a defined set of reaction conditions. In some embodiments, the modified polymerases of the instant application have a lowered systematic error percentage as compared to a corresponding reference polymerase not having the one or more amino acid modifications. In some embodiments, the modified polymerase, as disclosed herein, has a systematic error percentage of less than 3%. In some embodiments, the modified polymerase, as disclosed herein, has a systematic error percentage of less than 1%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.9%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.8%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.7%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.6%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.5%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.4%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.3%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.2%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.1%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.09%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.08%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.05%. In some embodiments, the modified polymerases as disclosed herein have a systematic error percentage of less than 0.04%.

The term “strand bias”, as used herein, refers to the percentage of target bases in a sequencing run where the read (genotype) from one strand (e.g., positive strand) is different from the read (genotype) inferred from the other (e.g., negative) strand. The coverage of a given target base can be computed by counting the number of read bases mapped to it in an alignment. The mean coverage can be computed by averaging this value across every base in the target. Then, the relative coverage for a particular base can be computed as the ratio of these values. A relative coverage of 1 indicates that a particular base is covered at the expected average rate. A relative coverage above 1 indicates higher than expected coverage and below 1 indicates lower than expected coverage. Generally, the probability of ambiguous mapping increases as reads become shorter or less accurate. Ambiguous mapping is also more likely for reads that derive from repetitive or low complexity regions of the genome, including some regions with extreme (high) GC content. In some embodiments, the percentage of strand bias is lowered or reduced when using a modified polymerase as disclosed herein, as compared to a reference polymerase (e.g., a wild-type Taq polymerase) that does not contain the corresponding one or more amino acid modifications. In some embodiments, the modified polymerases of the instant application have a decreased (reduced) strand bias as compared to the corresponding non-modified polymerase. While the actual strand bias of a given polymerase can vary from reaction to reaction (or even within a single reaction mixture), the polymerase can be characterized by the percentage of target bases with no strand bias, observed under a defined set of reaction conditions.
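
The coverage, relative coverage and strand bias quantities described above can be illustrated with the following minimal Python sketch. It is offered only as an example under stated assumptions: alignments are represented as hypothetical (start, length) pairs on a zero-based target, per-strand genotype calls are given as equal-length strings, and all function names are illustrative rather than part of any particular analysis pipeline.

```python
def per_base_coverage(target_length, alignments):
    """Count the read bases mapped to each target position.

    Assumption: each alignment is an ungapped (start, length) pair.
    """
    coverage = [0] * target_length
    for start, length in alignments:
        for pos in range(max(start, 0), min(start + length, target_length)):
            coverage[pos] += 1
    return coverage


def relative_coverage(coverage):
    """Ratio of each base's coverage to the mean coverage of the target."""
    mean_coverage = sum(coverage) / len(coverage)
    return [c / mean_coverage for c in coverage]


def strand_bias_percent(plus_calls, minus_calls):
    """Percentage of target bases whose genotype differs between strands."""
    differing = sum(1 for p, m in zip(plus_calls, minus_calls) if p != m)
    return 100.0 * differing / len(plus_calls)


if __name__ == "__main__":
    cov = per_base_coverage(4, [(0, 2), (1, 3)])
    print(cov, relative_coverage(cov))
    print(strand_bias_percent("ACGT", "ACGA"))
```

Under this sketch, the percentage of target bases with no strand bias is simply 100 minus the value returned by `strand_bias_percent`.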

In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of above 25%. In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of about 30%. In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of about 40%. In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of about 45%. In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of about 50%. In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of about 60%. In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of about 70%. In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of about 75%. In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of about 80%. In some embodiments, the modified polymerases as disclosed herein comprise a percentage of target bases with no strand bias of about 85%. Conversely, in some embodiments the modified polymerases as disclosed herein can include about 15% of target bases with strand bias. In another embodiment, the modified polymerases as disclosed herein can include about 20%, 25%, 30%, 35%, 40%, 45% or 50% of target bases with strand bias.

The terms “signal to noise ratio” or “SNR” refer to the ratio of signal power to noise power. Generally, SNR is a method of measuring a desired signal compared to the level of background noise. In some embodiments, “signal to noise ratio” can refer to the ratio of signal power obtained during a sequencing run as compared to background noise of the same sequencing run. In some embodiments, the instant application discloses methods, kits, apparatuses, and compositions that provide a means to increase the signal to noise ratio. In some embodiments, the disclosure relates generally to a method for performing nucleic acid sequencing comprising contacting a modified polymerase with a nucleic acid template in the presence of one or more nucleotides, where the modified polymerase includes one or more amino acid modifications (e.g., a substitution) relative to a reference polymerase and has an increased signal to noise ratio relative to the reference polymerase not having the one or more amino acid modifications, and polymerizing at least one of the one or more nucleotides using the modified polymerase.

In some embodiments, the disclosure relates generally to compositions, methods, systems, apparatuses and kits comprising modified polymerases that are characterized by increased processivity, increased read length (including error-free read length), increased total sequencing throughput, improved thermostability and/or increased accuracy as compared to their unmodified counterpart (e.g., a reference polymerase), as well as to methods for making and using such modified polymerases in a wide range of biological and chemical reactions such as nucleotide polymerization, primer extension, generation of nucleic acid libraries and nucleic acid sequencing reactions.

In some embodiments, the disclosure relates generally to compositions, methods, systems, apparatuses and kits comprising modified polymerases that are characterized by decreased strand bias and/or reduced systematic error as compared to their unmodified counterparts (e.g., a reference polymerase), as well as to methods for making and using such modified polymerases in a wide range of biological and chemical reactions such as nucleotide polymerization, primer extension, generation of nucleic acid libraries and nucleic acid sequencing reactions.

In some embodiments, the modified polymerases encompassed within the scope of the present disclosure include one or more amino acid mutations (e.g., amino acid substitutions, additions or deletions) relative to the corresponding counterpart lacking the identical mutation(s). In some embodiments, the term “accuracy” as used herein can be measured by determining the rate of incorporation of a correct nucleotide during polymerization as compared to the rate of incorporation of an incorrect nucleotide during polymerization. In some embodiments, the rate of incorporation of an incorrect nucleotide can be greater than 0.3, 0.4, 0.5, 0.6, 0.7 seconds or more under elevated salt conditions (e.g., high ionic strength solution) as compared to standard (low ionic strength solution) salt conditions. While not wishing to be bound by any particular theory, it has been found by the applicants that the presence of elevated salt during polymerization slows down the rate of incorrect nucleotide incorporation, thereby producing a slower incorporation constant for the incorrect nucleotide. In some embodiments, a modified polymerase of the disclosure has enhanced accuracy compared to a reference polymerase lacking the corresponding mutation; optionally the modified polymerase or a biologically active fragment thereof has enhanced accuracy (as compared to a reference polymerase lacking the corresponding amino acid mutation) in the presence of a high ionic strength solution. Generally, a standard ionic strength solution, as used herein, refers to an ionic solution having less than 120 mM salt. In another embodiment, a standard ionic strength solution as used herein refers to an ionic solution having less than 100 mM salt.

In some embodiments, the disclosure relates generally to a modified polymerase that retains polymerase activity and/or primer extension activity in the presence of a high ionic strength solution. In some embodiments, a high ionic strength solution can be at least 120 mM salt concentration. In some embodiments, the high ionic strength solution is 125 mM to 200 mM salt concentration. In some embodiments, the salt can include a potassium and/or sodium salt, such as KCl and/or NaCl. It will be apparent to the skilled artisan that various other suitable salts can be used in place of, or in combination with, KCl and/or NaCl. In some embodiments, the ionic strength solution can further include a sulfate.

In some embodiments, the modified polymerase can amplify and/or sequence a nucleic acid molecule in the presence of a high ionic strength solution. In some embodiments, a modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater extent (for example as measured by “accuracy”) than a reference polymerase lacking one or more of the corresponding mutations (or homologous mutations) under identical conditions. In some embodiments, a modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater capacity (for example as measured by “accuracy”) than a reference polymerase lacking one or more of the mutations (or homologous mutations) under standard ionic strength conditions (i.e., low ionic strength as compared to a high ionic strength solution).

In some embodiments, the disclosure generally relates to a modified polymerase or a biologically active fragment thereof that can perform nucleotide polymerization or nucleotide incorporation in the presence of high ionic strength conditions as compared to a reference polymerase under the same conditions.

In some embodiments, the disclosure generally relates to a modified polymerase or a biologically active fragment thereof that has increased accuracy or increased processivity in the presence of high ionic strength conditions as compared to a reference polymerase under the same conditions.

In some embodiments, the disclosure generally relates to a modified polymerase or a biologically active fragment thereof that can detect a change in ion concentration during nucleotide polymerization in the presence of high ionic strength salt conditions as compared to a reference polymerase under the same conditions.

In some embodiments, the disclosure generally relates to a modified polymerase or a biologically active fragment thereof that can amplify or sequence a nucleic acid molecule in the presence of a high ionic strength solution.

In some embodiments, the disclosure generally relates to a modified polymerase or a biologically active fragment thereof that has increased accuracy as compared to a reference polymerase under the same conditions.

In some embodiments, the disclosure relates generally to methods, compositions, systems and kits comprising the use of such modified polymerases in nucleotide polymerization reactions, including nucleotide polymerization reactions wherein sequence information is obtained from a nucleic acid molecule. In some embodiments, the disclosure relates generally to methods, compositions, systems and kits comprising the use of such modified polymerases in clonal amplification reactions, including nucleic acid library synthesis. In some embodiments, the disclosure relates to methods for using such modified polymerases in ion-based nucleic acid sequencing reactions, where sequence information is obtained from a template nucleic acid using the ion-based sequencing system. In some embodiments, the disclosure relates generally to compositions, methods, systems, kits and apparatuses for carrying out a plurality of label-free DNA sequencing reactions (e.g., ion-based sequencing reactions) using a large-scale array of electronic sensors, for example field effect transistors (“FETs”).

In some embodiments, the disclosure relates generally to compositions (as well as related methods, systems, kits and apparatuses using such compositions) comprising a modified polymerase including at least one amino acid modification (e.g., amino acid substitution, addition, deletion or chemical modification) relative to a reference polymerase (where the reference polymerase does not include the at least one amino acid modification), where the modified polymerase is optionally characterized by a change (e.g., increase or decrease) in any one or more of the following properties relative to the reference polymerase: thermostability, read length, accuracy, strand bias, systematic error, total sequencing throughput, performance in salt (i.e., ionic strength) and processivity.

As used herein, the terms “Q17” or “Q20” and their variants, when used in reference to a given polymerase, refer to certain aspects of polymerase performance, particularly accuracy, in a given polymerase reaction, for example in a polymerase-based sequencing by synthesis reaction. For example, in a particular sequencing reaction, accuracy metrics can be calculated either through prediction algorithms or through actual alignment to a known reference genome. Predicted quality scores (“Q scores”) can be derived from algorithms that look at the inherent properties of the input signal and make fairly accurate estimates regarding whether a given single base included in the sequencing “read” will align. In some embodiments, such predicted quality scores can be useful to filter and remove lower quality reads prior to downstream alignment. In some embodiments, the accuracy can be reported in terms of a Phred-like Q score that measures accuracy on a logarithmic scale such that: Q10=90%, Q17=98%, Q20=99%, Q30=99.9%, Q40=99.99%, and Q50=99.999%. Phred quality scores (“Q”) are defined as a property which is logarithmically related to the base-calling error probabilities (“P”). Often the formula given for calculating “Q” is Q = 10*log₁₀(1/error rate). In some embodiments, the data obtained from a given polymerase reaction can be filtered to measure only polymerase reads measuring “N” nucleotides or longer and having a Q score that passes a certain threshold, e.g., Q10, Q17, Q100 (referred to herein as the “NQ17” score). For example, the 100Q20 score can indicate the number of reads obtained from a given reaction that are at least 100 nucleotides in length and have Q scores of Q20 (99%) or greater. Similarly, the 200Q20 score can indicate the number of reads that are at least 200 nucleotides in length and have Q scores of Q20 (99%) or greater.
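
As a non-limiting illustration of the Phred-like relationship given above, the following Python sketch converts between error rate and Q score and counts reads meeting a length and quality threshold (for example the 100Q20 score). The function names and the representation of each read as a hypothetical (length, Q score) pair are assumptions made only for this example.

```python
import math


def phred_q(error_rate):
    """Q = 10 * log10(1 / error rate)."""
    return 10 * math.log10(1 / error_rate)


def accuracy_from_q(q):
    """Accuracy (1 - error probability) implied by a Q score."""
    return 1 - 10 ** (-q / 10)


def count_nq(reads, min_length, min_q):
    """Number of reads at least `min_length` nt long with Q >= `min_q`.

    Assumption: each read is summarized as a (length, q_score) pair.
    """
    return sum(1 for length, q in reads if length >= min_length and q >= min_q)


if __name__ == "__main__":
    print(phred_q(0.01))
    print(round(accuracy_from_q(17), 3))
    print(count_nq([(150, 22), (90, 30), (120, 18)], 100, 20))
```

Under this relationship Q17 corresponds to an accuracy of roughly 98%, consistent with the rounded values listed above.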

In some embodiments, accuracy can also be calculated based on proper alignment using a reference genomic sequence, referred to herein as the “raw” accuracy. This is single pass accuracy, involving measurement of the “true” per base error associated with a single read, as opposed to consensus accuracy, which measures the error rate from the consensus sequence which is the result of multiple reads. Raw accuracy measurements can be reported in terms of “AQ” scores (for aligned quality). In some embodiments, the data obtained from a given polymerase reaction can be filtered to measure only polymerase reads measuring “N” nucleotides or longer having an AQ score that passes a certain threshold, e.g., AQ10, AQ17, AQ100 (referred to herein as the “NAQ17” score). For example, the 100AQ20 score can indicate the number of reads obtained from a given polymerase reaction that are at least 100 nucleotides in length and have AQ scores of AQ20 (99%) or greater. Similarly, the 200AQ20 score can indicate the number of reads that are at least 200 nucleotides in length and have AQ scores of AQ20 (99%) or greater.
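
For illustration only, a raw (single pass) aligned quality score and an “NAQ” count of the kind described above might be computed as sketched below. The sketch assumes an ungapped alignment in which only mismatches are counted as errors, applies the same logarithmic relationship used for Q scores to the per-base error of a single read, and treats a read with no errors as having unbounded quality; the function names are hypothetical.

```python
import math


def aligned_quality(read, reference):
    """AQ = -10 * log10(per-base error) for one aligned read.

    Assumptions: `read` and `reference` are equal-length, ungapped,
    and only mismatches count as errors.
    """
    if not read or len(read) != len(reference):
        raise ValueError("read and reference must be non-empty and equal length")
    errors = sum(1 for r, t in zip(read, reference) if r != t)
    if errors == 0:
        return math.inf  # a perfect read passes any finite threshold
    return -10 * math.log10(errors / len(read))


def count_n_aq(aligned_pairs, min_length, min_aq):
    """Number of reads >= `min_length` nt whose AQ score is >= `min_aq`."""
    return sum(
        1
        for read, reference in aligned_pairs
        if len(read) >= min_length and aligned_quality(read, reference) >= min_aq
    )


if __name__ == "__main__":
    ref = "ACGT" * 25          # 100-nt reference segment
    noisy = "TGCAT" + ref[5:]  # five mismatches in 100 bases
    print(count_n_aq([(ref, ref), (noisy, ref)], 100, 20))
```

Real alignments also contain insertions and deletions, which this simplified sketch does not model.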

In some embodiments, the accuracy of the polymerase (including for example accuracy in a given sequencing reaction) can be measured in terms of the total number of “perfect” (i.e., zero-error) reads obtained from a polymerase reaction that are greater than 100, 200, 300, 400, 500, 750, 1000, 5000, 10000 or 100000 nucleotides in length.

In some embodiments, the accuracy of the polymerase can be measured in terms of the longest perfect read (typically measured in terms of number of nucleotides included in the read) that is obtained from a polymerase reaction.

In some embodiments, the accuracy of the polymerase can be measured in terms of fold-increase in sequencing throughput obtained in a given sequencing reaction. For example, in some embodiments an exemplary modified polymerase encompassed by the scope of the present disclosure may have an accuracy that is 2-fold, 5-fold, 10-fold, 20-fold, 50-fold, 75-fold, 100-fold, 150-fold, 200-fold, 400-fold, 500-fold, or more, greater than that of a reference polymerase (or an unmodified, naturally occurring polymerase).

In some embodiments, the accuracy of the polymerase can be measured in terms of percentage increase in templating efficiency obtained in a given polymerization reaction. For example, in some embodiments an exemplary modified polymerase encompassed within the scope of the present disclosure may have an accuracy that is 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, or more, greater than that of a reference polymerase under identical polymerization conditions.

Some exemplary non-limiting descriptions of accuracy metrics can be found in: Ewing B, Hillier L, Wendl M C, Green P. (1998): Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8(3):175-185; Ewing B, Green P. (1998): Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8(3):186-194; Dear S, Staden R (1992): A standard file format for data from DNA sequencing instruments. DNA Sequence, 3, 107-110; Bonfield J K, Staden R (1995): The application of numerical estimates of base calling accuracy to DNA sequencing projects. Nucleic Acids Res. 1995 Apr. 25; 23(8):1406-10, herein incorporated by reference in their entireties.

In some embodiments, the accuracy of a given polymerase (including any of the reference and/or modified polymerases described herein) can be measured in an ion-based sequencing reaction; such accuracies can optionally be compared with each other to determine whether a given amino acid mutation increases or decreases the sequencing accuracy relative to a reference and/or unmodified polymerase. In some embodiments, the accuracy of one or more polymerases can be measured using any ion-based sequencing apparatus supplied by Ion Torrent Technologies (Life Technologies Corp., CA), including for example the Ion Torrent PGM™ or Proton™ Sequencer, optionally using the sequencing protocols and reagents provided by Ion Torrent Systems. Some examples of accuracy calculations using an ion-based sequencing system are described in the Ion Torrent Application Note titled “Ion Torrent: Ion Personal Genome Machine™ Performance Overview, Performance Spring 2011” (Life Technologies Corporation, South San Francisco, Calif.), hereby incorporated by reference in its entirety. In some embodiments, the accuracy of one or more modified polymerases prepared according to the present disclosure can be determined using any appropriate method and/or any appropriate next-generation sequencing platform (such as Roche 454 GS or Illumina HiSeq, MiSeq or HiSeq X Ten platform).

As used herein, the terms “dissociation rate constant” and “dissociation time constant”, when used in reference to a given polymerase, refer to the time constant for dissociation (“koff”) of a polymerase from a nucleic acid template under a defined set of reaction conditions. Some exemplary assays for measuring the dissociation time constant of a polymerase are described further below. In some embodiments, the dissociation time constant can be measured in units of inverse time, e.g., sec⁻¹ or min⁻¹.
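
As a purely illustrative sketch, and assuming simple first-order dissociation kinetics (an assumption made for this example rather than a requirement of the definition above), a koff value expressed in units of inverse time can be related to the fraction of polymerase still bound to template after a given time, to the half-life of the complex, and to the mean residence time. The function names are hypothetical.

```python
import math


def fraction_bound(koff, t):
    """Fraction of polymerase-template complexes remaining at time t.

    Assumption: first-order decay, exp(-koff * t), with koff in 1/time
    and t in the matching time unit.
    """
    return math.exp(-koff * t)


def half_life(koff):
    """Time at which half of the complexes have dissociated."""
    return math.log(2) / koff


def mean_residence_time(koff):
    """Average lifetime of the complex under first-order kinetics."""
    return 1 / koff


if __name__ == "__main__":
    koff = 0.25  # hypothetical value, in sec^-1
    print(fraction_bound(koff, half_life(koff)))
    print(mean_residence_time(koff))
```

Under these assumptions a smaller koff corresponds to a polymerase that remains bound to the template for longer.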

In some embodiments, the disclosure relates generally to methods (and related kits, systems, apparatus and compositions) for using an isolated modified polymerase including at least one amino acid modification relative to a reference polymerase lacking the at least one amino acid modification and for providing an increase in average read length of primer extension products in a primer extension reaction using the modified polymerase relative to the average read length of primer extension products obtained using the reference polymerase under identical conditions. In some embodiments, the isolated modified polymerase provides an increase in average error-free read length of primer extension products in a primer extension reaction using the modified polymerase, relative to the average error-free read length of primer extension products obtained using a corresponding polymerase lacking the one or more amino acid modifications. In some embodiments, the isolated polymerase having at least one amino acid modification relative to the reference polymerase provides an increase in average error-free read length of at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, or greater, as compared to the reference polymerase lacking the at least one amino acid modification under identical conditions. Optionally, the modified polymerase includes one or more amino acid substitutions relative to the unmodified polymerase. In some embodiments, the modified polymerase includes two or more amino acid substitutions relative to a reference polymerase lacking the two or more amino acid substitutions. In some embodiments, the primer extension reaction is an ion-based sequencing reaction. In some embodiments, the primer extension reaction is an emPCR based amplification reaction. In some embodiments, the primer extension reaction is a bridge PCR amplification reaction. In some embodiments, the primer extension reaction includes a label such as a reversible terminator in the primer extension reaction.

In some embodiments, the reference polymerase is a naturally occurring or wild type polymerase. In some embodiments, the reference polymerase is a naturally occurring thermostable DNA polymerase. In some embodiments, the reference polymerase is a full-length wild-type Taq DNA polymerase. In some embodiments, the reference polymerase is a truncated but amino acid unmodified Taq DNA polymerase (such as Klentaq-235 DNA polymerase). In other embodiments, the reference polymerase includes a derivative, truncated, mutant or variant form of a naturally occurring polymerase that is different from the modified polymerase. For example, a reference polymerase may omit one or more amino acid mutations (e.g., one or more substitutions, deletions, or additions) as compared to the modified polymerase.

In some embodiments, the disclosure relates generally to methods for performing a nucleotide polymerization reaction, comprising: contacting a modified polymerase with a nucleic acid template in the presence of one or more nucleotides; and polymerizing at least one of the one or more nucleotides using the modified polymerase. The polymerizing optionally further includes polymerizing the at least one nucleotide in a template-dependent fashion. In some embodiments, the modified polymerase includes one or more amino acid substitutions relative to a reference polymerase that does not include the one or more amino acid substitutions.

In some embodiments, the method further includes hybridizing a primer to the template prior to, during, or after the contacting. The polymerizing can include polymerizing the at least one nucleotide onto an end of the primer using the modified polymerase.

In some embodiments, the polymerizing is performed in the proximity of a sensor that is capable of detecting the polymerization of the at least one nucleotide by the modified polymerase.

In some embodiments, the method further includes detecting a signal indicating the polymerization of the at least one of the one or more nucleotides by the modified polymerase using the sensor.

In some embodiments, the modified polymerase, the reference polymerase, or both are a DNA polymerase. The DNA polymerase can include, without limitation, a bacterial DNA polymerase, prokaryotic DNA polymerase, eukaryotic DNA polymerase, archaeal DNA polymerase, viral DNA polymerase or phage DNA polymerase.

In some embodiments, the DNA polymerase is selected from the group consisting of an A family DNA polymerase; a B family DNA polymerase; a mixed-type polymerase; an unclassified DNA polymerase and RT family polymerase; and variants and derivatives thereof.

In some embodiments, the DNA polymerase is an A family DNA polymerase selected from the group consisting of a Pol I-type DNA polymerase such as E. coli DNA polymerase, the Klenow fragment of E. coli DNA polymerase, Bst DNA polymerase, Taq DNA polymerase, Platinum Taq DNA polymerase series, Omni Klen Taq DNA polymerase series, Klen Taq DNA polymerase series, T7 DNA polymerase, and Tth DNA polymerase. In some embodiments, the DNA polymerase is Bst DNA polymerase. In other embodiments, the DNA polymerase is E. coli DNA polymerase I. In some embodiments, the DNA polymerase is the Klenow fragment of E. coli DNA polymerase. In some embodiments, the polymerase is Taq DNA polymerase. In some embodiments, the polymerase is T7 DNA polymerase.

In other embodiments, the DNA polymerase is a B family DNA polymerase selected from the group consisting of Bst polymerase, Tli polymerase, Pfu polymerase, Pfu turbo polymerase, Pyrobest polymerase, Pwo polymerase, KOD polymerase, Sac polymerase, Sso polymerase, Poc polymerase, Pab polymerase, Mth polymerase, Pho polymerase, ES4 polymerase, VENT polymerase, DEEPVENT polymerase, Therminator™ polymerase, phage Phi29 polymerase, and phage B103 polymerase. In some embodiments, the polymerase is KOD polymerase. In some embodiments, the polymerase is Therminator™ polymerase. In some embodiments, the polymerase is phage Phi29 DNA polymerase. In some embodiments the polymerase is phage B103 polymerase, including, for example, the variants disclosed in U.S. Patent Publication No. 20110014612 which is incorporated by reference herein in its entirety.

In other embodiments, the DNA polymerase is a mixed-type polymerase selected from the group consisting of EX-Taq polymerase, LA-Taq polymerase, Expand polymerase series, and Hi-Fi polymerase. In yet other embodiments, the DNA polymerase is an unclassified DNA polymerase selected from the group consisting of Tbr polymerase, Tfl polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tih polymerase, and Tfi polymerase.

In other embodiments, the DNA polymerase is a reverse transcriptase (RT) polymerase selected from the group consisting of HIV reverse transcriptase, M-MLV reverse transcriptase and AMV reverse transcriptase. In some embodiments, the polymerase is HIV reverse transcriptase or a fragment thereof having DNA polymerase activity and/or primer extension activity.

Suitable bacterial DNA polymerases include without limitation E. coli DNA polymerases I, II, III, IV and V, the Klenow fragment of E. coli DNA polymerase, Clostridium stercorarium (Cst) DNA polymerase, Clostridium thermocellum (Cth) DNA polymerase, Bacillus stearothermophilus (Bst) DNA polymerase and Sulfolobus solfataricus (Sso) DNA polymerase.

Suitable eukaryotic DNA polymerases include without limitation the DNA polymerases α, δ, ε, η, ζ, γ, β, σ, λ, μ, and κ, as well as the Rev1 polymerase (terminal deoxycytidyl transferase) and terminal deoxynucleotidyl transferase (TdT).

Suitable viral and/or phage DNA polymerases include without limitation T4 DNA polymerase, T5 DNA polymerase, T7 DNA polymerase, Phi-15 DNA polymerase, Phi-29 DNA polymerase (see, e.g., U.S. Pat. No. 5,198,543; also referred to variously as Φ29 polymerase, phi29 polymerase, phi 29 polymerase, Phi 29 polymerase, and Phi29 polymerase); Φ15 polymerase (also referred to herein as Phi-15 polymerase); Φ21 polymerase (Phi-21 polymerase); PZA polymerase; PZE polymerase, PRD1 polymerase; Nf polymerase; M2Y polymerase; SF5 polymerase; f1 DNA polymerase, Cp-1 polymerase; Cp-5 polymerase; Cp-7 polymerase; PR4 polymerase; PR5 polymerase; PR722 polymerase; L17 polymerase; M13 DNA polymerase, RB69 DNA polymerase, G1 polymerase; GA-1 polymerase, BS32 polymerase; B103 polymerase; a polymerase obtained from any phi-29 like phage or derivatives thereof, etc. See, e.g., U.S. Pat. No. 5,576,204, filed Feb. 11, 1993; U.S. Pat. Appl. No. 2007/0196846, published Aug. 23, 2007.

Suitable archaeal DNA polymerases include without limitation the thermostable and/or thermophilic DNA polymerases such as, for example, Thermus aquaticus (Taq) DNA polymerase, Thermus filiformis (Tfi) DNA polymerase, Thermococcus zilligi (Tzi) DNA polymerase, Thermus thermophilus (Tth) DNA polymerase, Thermus flavus (Tfl) DNA polymerase, Pyrococcus woesei (Pwo) DNA polymerase, Pyrococcus furiosus (Pfu) DNA polymerase as well as Turbo Pfu DNA polymerase, Thermococcus litoralis (Tli) DNA polymerase or Vent DNA polymerase, Pyrococcus sp. GB-D polymerase (“Deep Vent” DNA polymerase, New England Biolabs), Thermotoga maritima (Tma) DNA polymerase, Bacillus stearothermophilus (Bst) DNA polymerase, Pyrococcus kodakaraensis (KOD) DNA polymerase, Pfx DNA polymerase, Thermococcus sp. JDF-3 (JDF-3) DNA polymerase, Thermococcus gorgonarius (Tgo) DNA polymerase, Thermococcus acidophilium DNA polymerase; Sulfolobus acidocaldarius DNA polymerase; Thermococcus sp. 9° N-7 DNA polymerase; Thermococcus sp. NA1; Pyrodictium occultum DNA polymerase; Methanococcus voltae DNA polymerase; Methanococcus thermoautotrophicum DNA polymerase; Methanococcus jannaschii DNA polymerase; Desulfurococcus strain TOK DNA polymerase (D. Tok Pol); Pyrococcus abyssi DNA polymerase; Pyrococcus horikoshii DNA polymerase; Pyrococcus islandicum DNA polymerase; Thermococcus fumicolans DNA polymerase; Aeropyrum pernix DNA polymerase; the heterodimeric DNA polymerase DP1/DP2, etc.

In some embodiments, the modified polymerase is an RNA polymerase. Suitable RNA polymerases include, without limitation, T3, T5, T7, and SP6 RNA polymerases.

In some embodiments, the polymerase is a reverse transcriptase (RT). Suitable reverse transcriptases include without limitation reverse transcriptases from HIV, HTLV-I, HTLV-II, FeLV, FIV, SIV, AMV, MMTV and MoMuLV, as well as the commercially available “Superscript” reverse transcriptases (Life Technologies Corp., CA) and telomerases.

In some embodiments, the modified polymerase is derived from a known DNA polymerase. DNA polymerases have been classified into seven different families, based upon both amino acid sequence comparisons and three-dimensional structure analyses. The DNA polymerase I (pol I) or type A polymerase family includes the repair polymerases E. coli DNA pol I, Thermus aquaticus pol I, and Bacillus stearothermophilus pol I, replicative DNA polymerases from some bacteriophages (T3, T5 and T7) and eukaryotic mitochondrial DNA polymerases. The DNA polymerase α (pol α) or type B polymerase family includes all eukaryotic replicating DNA polymerases as well as archaebacterial DNA polymerases, viral DNA polymerases, DNA polymerases encoded in mitochondrial plasmids of various fungi and plants, and the polymerases from bacteriophages T4 and RB69. Family C polymerases are the primary bacterial chromosome replicative enzymes. These are sometimes considered a subset of family X, which contains the eukaryotic polymerase pol β, as well as other eukaryotic polymerases such as pol σ, pol λ, pol μ, and terminal deoxynucleotidyl transferase (TdT). Family D polymerases are all found in the Euryarchaeota subdomain of Archaea and are thought to be replicative polymerases. The family Y polymerases are called translesion synthesis (TLS) polymerases due to their ability to replicate through damaged DNA. They are also known as error-prone polymerases since they have a low fidelity on undamaged templates. This family includes Pol η, Pol ζ, Pol ι (iota), Pol κ (kappa), and Rev1, and Pol IV and Pol V from E. coli. Finally, the reverse transcriptase family includes reverse transcriptases from retroviruses and eukaryotic polymerases, usually restricted to telomerases. These polymerases use an RNA template to synthesize the DNA strand, and are also known as RNA-dependent DNA polymerases.

In some embodiments, a modified polymerase or biologically active fragment thereof can be prepared using any suitable method or assay known to one of skill in the art. In some embodiments, any suitable method of protein engineering to obtain a modified polymerase or biologically active fragment thereof is encompassed within the scope of the present disclosure. For example, site-directed mutagenesis is a technique that can be used to introduce one or more known or random mutations within a DNA construct. The introduction of the one or more amino acid mutations can be verified, for example, against a standard or reference polymerase or via nucleic acid sequencing. Once verified, the construct containing the one or more of the amino acid mutations can be transformed into bacterial cells and expressed.

Typically, colonies containing mutant expression constructs are inoculated in media, induced, and grown to a desired optical density before collection (often via centrifugation) and purification of the supernatant. It will be readily apparent to the skilled artisan that the supernatant can be purified by any suitable means. Typically, a column for analytical or preparative protein purification is selected. In some embodiments, a modified polymerase or biologically active fragment thereof prepared using the methods can be purified, without limitation, over a heparin column essentially according to the manufacturer's instructions.

Once purified, the modified polymerase or biologically active fragment thereof can be assessed using any suitable method for various polymerase activities, properties, or characteristics. In some embodiments, the polymerase activity, property, or characteristic being assessed will depend on the application of interest. For example, a polymerase used to amplify or sequence a nucleic acid molecule of about 300 to about 600 bp in length can be analyzed for properties such as increased processivity and/or increased read length relative to a reference polymerase lacking the one or more amino acid modifications (e.g., a substitution, deletion or addition). In another example, an application requiring deep targeted-resequencing of a nucleic acid molecule of about 100 bp in length may include polymerase properties such as increased raw accuracy, increased total sequencing throughput, decreased strand bias or reduced systematic error. In some embodiments, the one or more polymerase properties assessed can be related to polymerase performance or polymerase activity in the presence of a high ionic strength solution such as at least 120 mM salt.

In some embodiments, a modified polymerase or biologically active fragment thereof prepared according to the methods disclosed herein can be assessed for DNA binding activity, nucleotide polymerization activity, primer extension activity, strand displacement activity, reverse transcriptase activity, 3′-5′ exonuclease (proofreading) activity, and the like.

In some embodiments, a modified polymerase or biologically active fragment thereof prepared according to the methods can be assessed for increased accuracy, increased processivity, increased average read length, increased minimum read length, increased total sequencing throughput, reduced strand bias, reduced systematic error, increased AQ20, increased 200Q17 value or the ability to perform nucleotide polymerization as compared to a reference polymerase under the same conditions. In some embodiments, the modified polymerase or the biologically active fragment thereof can be assessed for any one of the polymerase activities in the presence of a high ionic strength solution (e.g., a salt solution having at least 120 mM salt such as NaCl and/or KCl).
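
The Q-based metrics named above (e.g., AQ20 and the 200Q17 value) rest on a per-base quality score. The following is a minimal sketch only, assuming the conventional Phred relationship Q = -10·log10(p) between a quality score Q and a per-base error probability p; this passage does not itself define the relationship, and the function names are hypothetical and chosen for illustration.

```python
import math


def phred_to_error_probability(q):
    """Convert a Phred-style quality score to a per-base error probability.

    Assumes the conventional definition Q = -10 * log10(p).
    """
    return 10 ** (-q / 10.0)


def error_probability_to_phred(p):
    """Convert a per-base error probability (0 < p <= 1) to a Phred-style score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in the interval (0, 1]")
    return -10.0 * math.log10(p)


if __name__ == "__main__":
    # Under this assumed definition, Q20 corresponds to a 1% error probability
    # and Q17 to roughly a 2% error probability.
    for q in (17, 20):
        print(q, phred_to_error_probability(q))
```

Under this assumption, a higher Q value for a modified polymerase relative to a reference polymerase under the same conditions would correspond to a lower estimated per-base error probability.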

In some embodiments, the modified polymerase or biologically active fragment thereof is optionally characterized by a change (e.g., increase or decrease) in any one or more of the following properties (often, relative to a polymerase lacking the corresponding one or more amino acid mutations): dissociation time constant, rate of dissociation of polymerase from a given nucleic acid template, binding affinity of the polymerase for a given nucleic acid template, as well as for properties that are associated with a nucleic acid sequencing reaction such as average read length, minimum read length, accuracy, total number of perfect reads, total sequencing throughput, strand bias, systematic error, fold-increase in throughput of a sequencing reaction, performance in salt (i.e., ionic strength), AQ20, average error-free read length, error-rate, 100Q17 value, 200Q17 value, Q score, raw read accuracy, and processivity. It will be understood that in illustrative embodiments of the present invention, the modified polymerase is used in an emulsion PCR reaction to amplify templates as part of a sequencing workflow, for example to amplify templates on a solid support, and in some illustrative embodiments, to clonally amplify templates on a solid support. Methods for making emulsions and performing emulsion PCR are known in the art. Compounds for making emulsions such as biocompatible oils and emulsion stabilizers are available commercially (e.g. Sigma, St. Louis Mo.; Unigema, New Jersey). The nucleic acid sequence of at least a portion of the amplified templates is then determined. The results of this sequence determination are compared to results of similar experiments performed with a reference polymerase, such as Taq polymerase (SEQ ID NO: 1) or the modified Taq polymerase of SEQ ID NO: 34, for an emulsion PCR template amplification step. The examples provided herein demonstrate the performance of a specific example of such a comparative test.
A library of nucleic acid molecules can be amplified onto Ion Sphere™ particles (Ion Torrent Systems, Part No. 602-1075-01) essentially according to the protocols provided in the User Guide for the Ion Xpress™ Template Kit v 2.0 (Ion Torrent Systems, Part No. 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, Part No. 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, Part No. 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, Part No. 4466463), except that an on-test or reference polymerase can be used in place of the polymerase provided in the kit and the results of the on-test polymerase can be compared to those generated with the reference polymerase. The amplified nucleic acid molecules are then loaded into a PGM™ 314 sequencing chip. The chip is loaded into an Ion Torrent PGM™ Sequencing system (Ion Torrent Systems/Life Technologies, Part No. 4462917) and sequenced essentially according to the protocols provided in the User Guide for the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4469714 Rev A) and using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, Part No. 4462923).

In some embodiments, the modified polymerase or biologically active fragment thereof can be assessed individually with respect to known values in the art for an analogous polymerase. In some embodiments, a modified polymerase or biologically active fragment thereof prepared according to the methods disclosed herein can be assessed against a known or reference polymerase under similar or identical conditions. In some embodiments, the conditions can include amplifying or sequencing a nucleic acid molecule in the presence of a high ionic strength solution.

In some embodiments, the disclosure relates generally to methods for producing a plurality of modified polymerases or biologically active fragments. In some embodiments, the disclosure relates generally to methods for producing a plurality of modified polymerases or biologically active fragments using a high-throughput or automated system. In some embodiments, the methods comprise mixing a plurality of modified polymerases or biologically active fragments with a series of reagents necessary for protein purification and extracting the purified polymerases or biologically active fragments from the mixture. In one example, a plurality of random or site-directed mutagenesis reactions can be prepared in a 96- or 384-well plate. Optionally, the contents of the 96- or 384-well plate can undergo an initial screen to identify polymerase mutant constructs. The contents of each individual well (or the contents of each well from an initial screen) can be delivered to a series of flasks, tubes or shakers for inoculation and induction. Once at the required optical density, the flasks, tubes or shakers can be centrifuged and the supernatants recovered. Each supernatant can undergo protein purification, for example via fully automated column purification (for example see, Camper and Viola, Analytical Biochemistry, 2009, p 176-181). The purified modified polymerases or biologically active fragments can be assessed for one, or a combination of polymerase activities, such as DNA binding, primer extension, strand displacement, reverse transcriptase activity, and the like. It is envisaged that the skilled artisan can use the method (or variations of the methods that are within the scope of the disclosure) to identify a plurality of modified polymerases or biologically active fragments. In some aspects, the methods can be used to identify a plurality of modified polymerases or biologically active fragments having enhanced accuracy as compared to a reference polymerase under the same conditions.
In some embodiments, the methods can be used to identify a plurality of modified polymerases or biologically active fragments thereof having enhanced accuracy in the presence of a high ionic strength solution. In some aspects, the methods can be used to identify a plurality of modified polymerases or biologically active fragments having enhanced read length as compared to a reference polymerase under the same conditions. In some embodiments, the methods can be used to identify a plurality of modified polymerases or biologically active fragments thereof having enhanced read length in the presence of a high ionic strength solution. In some aspects, the methods can be used to identify a plurality of modified polymerases or biologically active fragments having enhanced thermostability as compared to a reference polymerase under the same conditions. In some embodiments, the methods can be used to identify a plurality of modified polymerases or biologically active fragments thereof having enhanced thermostability in the presence of a high ionic strength solution. In some aspects, the methods can be used to identify a plurality of modified polymerases or biologically active fragments having reduced strand bias and/or reduced systematic error as compared to a reference polymerase under the same conditions. In some embodiments, the methods can be used to identify a plurality of modified polymerases or biologically active fragments thereof having reduced strand bias and/or reduced systematic error in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution can include a KCl and/or NaCl salt. In some embodiments, the high ionic strength solution can be at least 120 mM salt. In some embodiments, the high ionic strength solution can be from 125 mM to 200 mM salt.
In some embodiments, the high ionic strength solution can be about 130 mM, 150 mM, 200 mM, 225 mM, 250 mM, 275 mM, 300 mM, 350 mM, 400 mM, 450 mM, 500 mM, or greater salt concentration. In some embodiments, the high ionic strength solution can be about 125 mM to about 400 mM salt. In some embodiments, the high ionic strength solution can be about 150 mM to about 275 mM salt. In some embodiments, the high ionic strength solution can be about 200 mM to about 250 mM salt. It will be apparent to the skilled artisan that various other suitable salts can be used in place of, or in combination with, KCl and/or NaCl. In some embodiments, the high ionic strength solution can further include a sulfate.

As will be readily apparent to the skilled artisan, the disclosure outlines an exemplary automated and high-throughput method to generate a library of modified polymerases or biologically active fragments. The disclosure also outlines methods to assess such modified polymerases or biologically active fragments for polymerase activities. It is also encompassed by the disclosure that the skilled artisan can readily produce a mutagenized library of constructs where every amino acid of the polymerase of interest can be mutated. In some embodiments, a mutagenized library can be prepared wherein each amino acid residue within the polymerase is mutated by every possible amino acid combination. In some embodiments, a mutagenized library can be prepared where each amino acid residue within the polymerase is mutated, and where the combination of possible amino acid mutations is limited to conservative or non-conservative amino acid substitutions. In both examples, mutagenized libraries can be created containing vast numbers of mutant constructs that can be applied through an automated or high-throughput system for purification or for initial screening. In some embodiments, plates of 96- or 384-library constructs representing a mutagenized library can be assessed for one or more polymerase activities using an ISFET based sequencing polymerase screen, using a next generation (i.e. high-throughput) platform (e.g., Ion Torrent Systems Personal Genome Machine and an Ion based ISFET Sequencing Chip (Life Technologies Corp, CA)).
In one example, the polymerase screen can include one or more 96- or 384-well plates representing a mutagenized library, where each well of the plate contains a different construct (modified polymerase) having one or more amino acid mutations as compared to a reference polymerase in at least one well on the same plate (lacking the one or more amino acid mutations). In some embodiments, the reference polymerase acts as a control sample within the 96- or 384-well plate to assess polymerase activity of each modified polymerase within the wells of the same plate. In some embodiments, the library of constructs and reference polymerase within the plate can further include a unique barcode for each modified polymerase within the plate. Thus, a 96-well plate may contain 96 barcodes if each well in the plate contains either a reference polymerase or a modified polymerase construct. Once purified, the mutagenized library of proteins can be assessed for one, or a combination of polymerase activities, such as DNA binding, primer extension, strand displacement, reverse transcriptase, nick-initiated polymerase activity, raw accuracy, increased total sequencing throughput, reduced strand bias, lowered systematic error, increased read length, increased processivity, increased thermostability, and the like. In some embodiments, the template libraries can further include template libraries that are known to perform well under the proposed amplification conditions, so that the well-performing template libraries can act as a baseline or control reading.
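
The plate arrangement described above, in which each well holds either the reference polymerase or one modified polymerase construct and each well carries a unique barcode, can be sketched as follows. This is an illustrative sketch only: the well naming (A1 through H12), the barcode labels (BC001, BC002, and so on), the placement of the reference polymerase in the first well, and the function name are all assumptions made for the example and are not taken from the disclosure.

```python
import string


def build_plate_layout(constructs, reference="reference polymerase", rows=8, cols=12):
    """Assign a reference control and modified constructs to wells of one plate.

    Each occupied well receives a unique barcode label. By assumption, the
    reference polymerase is placed in the first well (A1) as the control.
    """
    wells = [
        f"{row}{col}"
        for row in string.ascii_uppercase[:rows]
        for col in range(1, cols + 1)
    ]
    samples = [reference] + list(constructs)
    if len(samples) > len(wells):
        raise ValueError("more samples than wells on the plate")

    layout = {}
    for index, (well, sample) in enumerate(zip(wells, samples)):
        layout[well] = {
            "sample": sample,
            "barcode": f"BC{index + 1:03d}",  # hypothetical barcode label
            "is_reference": index == 0,
        }
    return layout


if __name__ == "__main__":
    # 95 hypothetical mutant constructs plus one reference fill a 96-well plate.
    plate = build_plate_layout([f"mutant_{i}" for i in range(1, 96)])
    print(len(plate), plate["A1"], plate["H12"])
```

Passing rows=16 and cols=24 gives the corresponding 384-well arrangement. A fully occupied 96-well plate carries 96 distinct barcodes, consistent with the statement above.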

Optionally, the purified modified polymerases or biologically active fragments thereof can be further assessed for other properties such as the ability to amplify or sequence a nucleic acid molecule in the presence of high salt. The source or origin of the polymerase to be mutated is generally not considered critical. For example, eukaryotic, prokaryotic, archaeal, bacterial, phage or viral polymerases can be used in the methods. In some embodiments, the polymerase can be a DNA or RNA polymerase. In some embodiments, the DNA polymerase can include a family A or family B polymerase. In some embodiments, the DNA polymerase can include a thermostable DNA polymerase. The exemplary methods provided herein are to be considered illustrative in view of the field of protein engineering and enzymatics and should not be construed as in any way limiting.

In some embodiments, the modified polymerase or a biologically active fragment thereof includes one or more amino acid mutations that are located inside the catalytic domain of the modified polymerase. In some embodiments, the modified polymerase or biologically active fragment thereof can include at least 25, 50, 75, 100, 150, or more amino acid residues of the catalytic domain. In some embodiments, the modified polymerase or biologically active fragment thereof can include any part of the catalytic domain that comprises at least 25, 50, 75, 100, 150, or more contiguous amino acid residues. In some embodiments, the modified polymerase or biologically active fragment thereof can include at least 25 contiguous amino acid residues of the catalytic domain and can optionally include one or more amino acid residues at the C-terminal or the N-terminal that are outside the catalytic domain. In some embodiments, the modified polymerase or a biologically active fragment can include any 25, 50, 75, 100, 150, or more contiguous amino acid residues of the catalytic domain coupled to any one or more non-catalytic domain amino acid residues.
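
Whether a fragment includes at least a given number of contiguous amino acid residues of a domain can be checked computationally. The sketch below is one possible approach, assuming that both the fragment and the domain are available as one-letter amino acid strings and that residues are compared by exact match; the function names and the example sequences are hypothetical. Because matching is exact, a mutation inside the domain interrupts the run that is counted.

```python
def longest_shared_run(fragment, domain):
    """Return the length of the longest contiguous stretch shared by both strings.

    Standard longest-common-substring dynamic programming, keeping only the
    previous row of the table.
    """
    best = 0
    previous = [0] * (len(domain) + 1)
    for residue_f in fragment:
        current = [0] * (len(domain) + 1)
        for j, residue_d in enumerate(domain, start=1):
            if residue_f == residue_d:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


def has_contiguous_domain_residues(fragment, domain, minimum=25):
    """True if the fragment contains at least `minimum` contiguous domain residues."""
    return longest_shared_run(fragment, domain) >= minimum


if __name__ == "__main__":
    # Made-up sequences for illustration only.
    domain = "ACDEFGHIKLMNPQRSTVWY" * 2
    fragment = "GG" + domain[5:35] + "GG"
    print(longest_shared_run(fragment, domain))
    print(has_contiguous_domain_residues(fragment, domain, minimum=25))
```

The thresholds recited above (25, 50, 75, 100, 150) can be supplied through the minimum parameter.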

In some embodiments, the modified polymerase (or biologically active fragment thereof) includes one or more amino acid mutations that are located inside the catalytic domain of the modified polymerase, and wherein the polymerase has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 or 50 contiguous amino acid residues of the catalytic domain and has at least 80% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 75 contiguous amino acid residues of the catalytic domain and has at least 85% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 or 50 contiguous amino acid residues of the catalytic domain and has at least 90% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the catalytic domain and has at least 95% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 or 50 contiguous amino acid residues of the catalytic domain and has at least 98% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 or 50 contiguous amino acid residues of the catalytic domain and has at least 99% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.
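
The percent identity thresholds recited above (80% through 99%) can be evaluated once two sequences have been aligned. The following is a minimal sketch, assuming the two sequences are already aligned to equal length with "-" marking gaps, and assuming identity is computed as identical positions divided by aligned columns in which at least one sequence has a residue. Other denominators are in common use, and this passage does not specify which applies; the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a gap opposite a
    residue counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == "-" and residue_b == "-":
            continue
        compared += 1
        if residue_a == residue_b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable positions in the alignment")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold=80.0):
    """True if percent identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Made-up ten-residue sequences differing at one position.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
    print(meets_identity_threshold("ACDEFGHIKL", "ACDEFGHIKV", threshold=95.0))
```

In practice the alignment itself would be produced by a separate alignment tool before a check of this kind is applied.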

In some embodiments, the modified polymerase or a biologically active fragment thereof includes one or more amino acid mutations that are located inside the DNA binding domain of the polymerase. In some embodiments, the modified polymerase or biologically active fragment thereof can include at least 25, 50, 75, 100, 150, or more amino acid residues of the DNA binding domain of the modified polymerase. In some embodiments, the modified polymerase or biologically active fragment thereof can include any part of the DNA binding domain that comprises at least 25, 50, 75, 100, 150, or more contiguous amino acid residues. In some embodiments, the modified polymerase or biologically active fragment thereof can include at least 25 contiguous amino acid residues of the binding domain and can optionally include one or more amino acid residues at the C-terminal or the N-terminal that are outside of the binding domain. In some embodiments, the modified polymerase or a biologically active fragment can include any 25, 50, 75, 100, 150 or more contiguous amino acid residues of the binding domain coupled to any one or more non-binding domain amino acid residues. In some embodiments, the modified polymerase (or biologically active fragment thereof) includes one or more amino acid mutations that are located inside the DNA binding domain of the modified polymerase, and wherein the polymerase has at least 80%, 85%, 90%, 95%, 96%, 97%, 98%, 99%, or more identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 80% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 85% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 90% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 95% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25 contiguous amino acid residues of the DNA binding domain and has at least 98% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33. In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 50 contiguous amino acid residues of the DNA binding domain and has at least 80%, 85%, 90%, 95%, 98%, or 99% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or a biologically active fragment thereof includes one or more amino acid mutations that are located outside the catalytic domain (also referred to herein as the DNA binding cleft) of the polymerase. The catalytic domains of the A family DNA polymerases, B family DNA polymerases and reverse transcriptases, as well as the RNA-dependent RNA polymerases, are well known; all share a common overall structure and catalytic mechanism. The catalytic domains of all these polymerases have a shape that has been compared to a right hand and consist of “palm”, “thumb” and “finger” domains. The palm domain typically contains the catalytic site for the phosphoryl transfer reaction. The thumb is thought to play a role in positioning the duplex DNA and in processivity and translocation. The fingers interact with the incoming nucleotide as well as the template base with which it is paired. The palm domains are homologous in the A, B and RT families, but the arrangements of the fingers and thumb are different. The thumb domains of the different polymerase families do share common features, containing parallel or anti-parallel α-helices, with at least one α-helix interacting with the minor groove of the primer-template complex. The fingers domain also conserves an α-helix positioned at the blunt end of the primer-template complex. This helix contains highly conserved side chains (the B motif).

Three conserved motifs, A, B, and C, have been identified for the A family polymerases. The A and C motifs are typically conserved in both the B family polymerases and the RT polymerases (Delarue et al., Protein Engineering 3: 461-467 (1990)).

In some embodiments, for the A family polymerases, the A motif comprises the consensus sequence:

(SEQ ID NO: 35) a. DXSXXE.

In some embodiments, for the A family polymerases, the B motif comprises the consensus sequence:

(SEQ ID NO: 36)  a. KXXXXXXYG

In some embodiments, for the A family polymerases, the C motif comprisesthe consensus sequence:

(SEQ ID NO: 37) a. VHDE
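As an illustration only, the A family consensus sequences above can be located in a protein sequence with a simple pattern scan. The sketch below assumes that "X" denotes any single residue; the function names and the toy sequence are hypothetical and are not taken from the Sequence Listing. A match to a short consensus pattern may also occur by chance, so hits in a real sequence would normally be confirmed by alignment.

```python
import re

# Consensus strings quoted in the text for the A family motifs.
# Reading "X" as "any single residue" is an assumption about the notation.
A_FAMILY_MOTIFS = {"A": "DXSXXE", "B": "KXXXXXXYG", "C": "VHDE"}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus string into a regex, mapping X to '.'."""
    return "".join("." if ch == "X" else re.escape(ch) for ch in consensus)


def find_motifs(sequence: str, motifs: dict) -> dict:
    """Return {motif name: [1-based start positions]} for each consensus.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    hits = {}
    for name, consensus in motifs.items():
        pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
        hits[name] = [m.start() + 1 for m in pattern.finditer(sequence)]
    return hits


# Hypothetical toy sequence, built only to contain one copy of each motif.
toy = "MADLSALELKAAAAAAYGTTVHDEQ"
print(find_motifs(toy, A_FAMILY_MOTIFS))
```

Positions are reported with 1-based numbering to match the residue numbering convention used elsewhere in this description.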

In some embodiments, the polymerase optionally comprises any A family polymerase, or biologically active fragment, mutant, variant or truncation thereof, wherein the linking moiety is linked to any amino acid residue of the A family polymerase, or biologically active fragment, mutant, variant or truncation thereof, that is situated outside the A, B or C motifs. In some embodiments, the linking moiety is linked to any amino acid residue of the A family polymerase, or biologically active fragment, that is situated outside the A motif, the B motif or the C motif.

The A and C motifs typically form part of the palm domain, and each motif typically contains a strictly conserved aspartic acid residue; these residues are involved in the catalytic mechanism common to all the DNA polymerases. DNA synthesis can be mediated by transfer of a phosphoryl group from the incoming nucleotide to the 3′ OH of the DNA, releasing a polyphosphate moiety and forming a new DNA phosphodiester bond. This reaction is typically catalyzed by a mechanism involving two metal ions, normally Mg²⁺, and the two conserved aspartic acid residues.

In some embodiments, the conserved glutamic acid residue in motif A of the A family DNA polymerases plays an important role in incorporation of the correct nucleotide, as does the corresponding conserved tyrosine in B family members (Minnick et al., Proc. Natl. Acad. Sci. USA 99: 1194-1199 (2002); Parsell et al., Nucleic Acids Res. 35: 3076-3086 (2002)). Mutations at the conserved Leu of motif A affect replication fidelity (Venkatesan et al., J. Biol. Chem. 281: 4486-4494 (2006)).

In some embodiments, the B motif contains conserved lysine, tyrosine and glycine residues. The B motif of E. coli pol I has been shown to bind nucleotide substrates and contains a conserved tyrosine which has been shown to be in the active site.

In some embodiments, for the B family polymerases, the A motif comprises the consensus sequence:

(SEQ ID NO: 38) DXXSLYPS.

In some embodiments, for the B family polymerases, the B motif comprises the consensus sequence:

(SEQ ID NO: 39) KXXXNSXYG

In some embodiments, for the B family polymerases, the C motif comprises the consensus sequence:

(SEQ ID NO: 40) YGDTDS

The residues in bold indicate invariant residues.

In some embodiments, the modified polymerase optionally comprises any B family polymerase, or biologically active fragment, mutant, variant or truncation thereof, wherein the linking moiety is linked to any amino acid residue of the B family polymerase, or biologically active fragment, mutant, variant or truncation thereof that is situated outside the A, B or C motifs. In some embodiments, the linking moiety is linked to any amino acid residue of the B family polymerase, or biologically active fragment, that is situated outside the A motif, the B motif or the C motif.

In some embodiments, the B family polymerases contain six conserved motifs, of which regions I and II correspond to the A and C motifs of the A family. Region III is involved in nucleotide binding and is functionally homologous to motif B. Regions I, II and III converge at the center of the active site from the palm (I), the fingers (II), and base of the thumb (III) to produce a contiguous conserved surface. Within these regions, a set of highly conserved residues form three chemically distinct clusters consisting of exposed aromatic residues, negatively charged residues, and positively charged residues, respectively. For example, in the replication polymerase of the bacteriophage RB69, these three clusters correspond to the following amino acid residues: Y416, Y567, and Y391 (exposed aromatic residues), D621, D623, D411, D684, and E686 (negatively charged residues), and K560, R482, and K486 (positively charged residues). See Wang et al., Cell 89: 1087-1099 (1997). These three clusters typically encompass the region in which the primer terminus and the incoming nucleotide would be expected to bind. In some embodiments, the modified polymerase optionally comprises any B family polymerase, or biologically active fragment, mutant, variant or truncation thereof, wherein the linking moiety is linked to any amino acid residue of the B family polymerase, or biologically active fragment, mutant, variant or truncation thereof that is situated outside one or more of these conserved amino acid clusters or motifs. In some embodiments, the linking moiety is linked to any amino acid residue of the B family polymerase, or biologically active fragment, mutant, variant or truncation thereof that is situated outside any of these conserved amino acid clusters or motifs.

The RT polymerases contain four conserved sequence motifs (Poch et al., EMBO J. 12: 3867-3874 (1989)), with motifs A and C containing the conserved catalytic aspartates. The integrity of motif B is also required for reverse transcriptase function.

The consensus sequence for motif A is DXXXXF/Y (SEQ ID NO: 41)

The consensus sequence for motif B is FXGXXXS/A (SEQ ID NO: 42)

The consensus sequence for motif C is YXDD (SEQ ID NO: 43)

The consensus sequence for motif D is GXXXXXXXK (SEQ ID NO: 44).

Mutations in the YXDD motif (motif C), the most highly conserved of these motifs, can abolish polymerase activity and alter the processivity and fidelity (Sharma et al., Antiviral Chemistry and Chemotherapy 16: 169-182 (2005)). In addition, the conserved lysine residue in motif D, a loop that is unique to the RT polymerases, is an invariant residue important for nucleotide binding (Canard et al., J. Biol. Chem. 274: 35768-35776 (1999)).

In some embodiments, the modified polymerase optionally comprises any RT polymerase, or biologically active fragment, mutant, variant or truncation thereof, wherein the linking moiety is linked to any amino acid residue of the RT polymerase, or biologically active fragment, mutant, variant or truncation thereof that is situated outside one or more of the A, B, C and D motifs. In some embodiments, the linking moiety is linked to any amino acid residue of the RT polymerase, or biologically active fragment, mutant, variant or truncation thereof that is situated outside any of these motifs.
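By way of a non-limiting sketch, the residues situated outside the RT motifs A to D can be enumerated by first marking every position covered by a motif match and then listing the remainder. The code below assumes that "X" denotes any residue and that a slash (as in "F/Y") denotes alternative residues at one position; the function names and the toy sequence are hypothetical. A short consensus pattern can also match by chance, so the result is a candidate list rather than a structural determination.

```python
import re

# Consensus strings quoted above for the RT motifs. Reading "X" as any residue
# and "F/Y" as "F or Y at this position" is an assumption about the notation.
RT_MOTIFS = {"A": "DXXXXF/Y", "B": "FXGXXXS/A", "C": "YXDD", "D": "GXXXXXXXK"}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus string with X wildcards and '/' alternatives."""
    parts = []
    i = 0
    while i < len(consensus):
        options = [consensus[i]]
        # Gather alternatives written as "F/Y" into one position.
        while i + 2 < len(consensus) and consensus[i + 1] == "/":
            options.append(consensus[i + 2])
            i += 2
        if options == ["X"]:
            parts.append(".")
        elif len(options) == 1:
            parts.append(re.escape(options[0]))
        else:
            parts.append("[" + "".join(options) + "]")
        i += 1
    return "".join(parts)


def positions_outside_motifs(sequence: str, motifs: dict) -> list:
    """Return the 1-based positions not covered by any motif match."""
    covered = set()
    for consensus in motifs.values():
        pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
        for m in pattern.finditer(sequence):
            first = m.start() + 1
            covered.update(range(first, first + len(m.group(1))))
    return [p for p in range(1, len(sequence) + 1) if p not in covered]


# Hypothetical toy sequence containing one motif A match and one motif C match.
toy = "AADAAAAYAAYMDDAA"
print(positions_outside_motifs(toy, RT_MOTIFS))
```

The same helper could be applied to the A or B family consensus sequences by passing a different motif dictionary.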

In some embodiments, the modified polymerase includes one or more modifications (including amino acid substitutions, deletions, additions or chemical modifications) located at any position other than at the conserved or invariant residues.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 25, 50, 75, or 100 contiguous amino acid residues having at least 80% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 50, 75, 100, 150, 175, or 200 contiguous amino acid residues having at least 85% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 225, 250, 275, 300, 325, 350, 375, or 400 contiguous amino acid residues having at least 85% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 50, 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, or more contiguous amino acid residues having at least 90% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 100, 200, 300, 400, 500, 600, 700, or more contiguous amino acid residues having at least 95% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 100 contiguous amino acid residues having at least 98% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 150 contiguous amino acid residues having at least 99% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 200 contiguous amino acid residues having at least 99% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.

In some embodiments, the modified polymerase or biologically active fragment thereof includes at least 400 contiguous amino acid residues having at least 99% identity to any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33.
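The "at least N contiguous amino acid residues having at least P% identity" criteria recited above can be illustrated with a sliding-window calculation. The following is a minimal sketch under stated assumptions: the two sequences are taken to be already aligned without gaps and of equal length, whereas in practice percent identity is usually determined with a sequence alignment tool. The function names and toy sequences are hypothetical.

```python
def best_window_identity(query: str, reference: str, window: int) -> float:
    """Highest percent identity over any `window` contiguous aligned positions.

    Assumes `query` and `reference` are pre-aligned, ungapped and equal in
    length (a simplification; real comparisons normally use an aligner).
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    if window < 1 or window > len(query):
        raise ValueError("window must be between 1 and the aligned length")
    matches = [int(a == b) for a, b in zip(query, reference)]
    current = sum(matches[:window])
    best = current
    # Slide the window one residue at a time, updating the match count.
    for i in range(window, len(matches)):
        current += matches[i] - matches[i - window]
        best = max(best, current)
    return 100.0 * best / window


def meets_criterion(query: str, reference: str, window: int, min_identity: float) -> bool:
    """True if some stretch of `window` contiguous residues reaches `min_identity` percent."""
    return best_window_identity(query, reference, window) >= min_identity


# Hypothetical toy sequences differing only at the last residue.
query = "ACDEFGHIKL"
reference = "ACDEFGHIKV"
print(best_window_identity(query, reference, 5))
print(meets_criterion(query, reference, 10, 95.0))
```

With these toy inputs, a five-residue window can be found that is fully identical, while the full ten-residue stretch has one mismatch and so falls below a 95% threshold.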

In some embodiments, in addition to the polymerase domains, the modified polymerase can include one or more additional functional domains, including domains required for 3′->5′ (reverse) exonuclease activity that mediates proofreading of the newly synthesized DNA strand, or for 5′->3′ (forward) exonuclease activity that mediates nick translation during DNA repair, or for FLAP endonuclease activity. In some embodiments, the modified polymerase has strand-displacing activity, and can catalyze nucleic acid synthesis by polymerizing nucleotides into the 3′ end of a nick within a double stranded nucleic acid template while simultaneously displacing the nucleic acid located downstream of the nick. It will be appreciated by one of skill in the art that a modified polymerase as encompassed by the present disclosure optionally has any one or more of these activities as well.

The 3′ to 5′ exonuclease proofreading domains of both A and B family DNA polymerases contain three conserved motifs, called Exo I, Exo II and Exo III, each of which contains an invariant aspartic acid residue essential for metal binding and exonuclease function. Alterations of these conserved aspartic acid residues result in proteins which retain polymerase activity, but are deficient in exonuclease activity (Hall et al., J. Gen. Virol. 76: 2999-3008 (1995)). Conserved motifs in the 5′ to 3′ exonuclease domains and amino acid alterations that affect exonuclease activity have also been identified (U.S. Pat. No. 5,466,591).

Representative examples of A family enzymes are E. coli Pol I, or the Klenow fragment of E. coli Pol I, Bst DNA polymerase, Taq DNA polymerase, T7 DNA polymerase and Tth DNA polymerase. A family enzymes also include the Platinum Taq DNA polymerase series.

In some embodiments, the A family enzymes are characterized by high DNA elongation rates but can have poor fidelity because of the lack of 3′-5′ exonuclease activity. In some embodiments, the B family enzymes can have high fidelity owing to their 3′-5′ exonuclease activity but can achieve low DNA elongation rates.

Other types of polymerases include, for example, Tbr polymerase, Tfl polymerase, Tru polymerase, Tac polymerase, Tne polymerase, Tma polymerase, Tih polymerase, Tfi polymerase and the like. RT polymerases include HIV reverse transcriptase, Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase, Avian Myeloblastosis Virus (AMV) reverse transcriptase or Rous Sarcoma Virus (RSV) reverse transcriptase. Variants, modified products and derivatives thereof are also usable. Similarly, Taq, Platinum Taq, Tth, Tli, Pfu, PfuTurbo, Pyrobest, Pwo and KOD, VENT, DEEPVENT, EX-Taq, LA-Taq, Therminator™, the Expand series and Platinum Taq Hi-Fi are all commercially available. Other enzymes can be readily isolated from specific bacteria by those of ordinary skill in the art.

One exemplary polymerase, E. coli DNA polymerase I (“Pol I”), possesses three enzymatic activities: a 5′ to 3′ DNA polymerase activity; a 3′ to 5′ exonuclease activity that mediates proofreading; and a 5′ to 3′ exonuclease activity mediating nick translation during DNA repair. The Klenow fragment is a large protein fragment produced when E. coli Pol I is proteolytically cleaved by subtilisin. It retains the polymerase and proofreading exonuclease activities, but lacks the 5′ to 3′ exonuclease activity. An exo-Klenow fragment which has been mutated to remove the proofreading exonuclease activity is also available. The structure of the Klenow fragment shows that highly conserved residues that interact with DNA include N675, N678, K635, R631, E611, T609, R835, D827, S562 and N579 (Beese et al., Science 260: 352-355 (1993)).

Arg682 in the Klenow fragment of E. coli DNA polymerase I (pol I) is important for the template-dependent nucleotide-binding function, and appears to maintain high processivity of the DNA polymerase (Pandey et al., European Journal of Biochemistry, 214:59-65 (1993)).

In some embodiments, the modified polymerase can be derived from Taq DNA polymerase, which is an A family DNA polymerase derived from the thermophilic bacterium Thermus aquaticus. It is best known for its use in the polymerase chain reaction. Taq polymerase lacks a proofreading activity, and thus has a relatively low replication fidelity (Kim et al., Nature 376: 612-616 (2002)).

In some embodiments, the polymerase can be derived from the T7 DNA polymerase of bacteriophage T7, which is an A family DNA polymerase that consists of a 1:1 complex of the viral T7 gene 5 protein (80 kDa) and the E. coli thioredoxin (12 kDa). It lacks a 5′->3′ exonuclease domain, but the 3′->5′ exonuclease activity is approximately 1000-fold greater than that of E. coli Klenow fragment. The exonuclease activity appears to be responsible for the high fidelity of this enzyme and prevents strand displacement synthesis. This polymerase typically exhibits high levels of processivity.

In some embodiments, the polymerase can be derived from KOD DNA polymerase, which is a B family DNA polymerase derived from Thermococcus kodakaraensis. KOD polymerase is a thermostable DNA polymerase with high fidelity and processivity.

In some embodiments, the polymerase can be derived from the Therminator™ DNA polymerase, which is also a B family DNA polymerase. Therminator™ is an A485L point mutation of the DNA polymerase from Thermococcus species 9oN-7 (Ichida et al., Nucleic Acids Res. 33: 5214-5222 (2005)). Therminator™ polymerase has an enhanced ability to incorporate modified substrates such as dideoxynucleotides, ribonucleotides, and acyclonucleotides.

In some embodiments, the polymerase can be derived from a Phi29 polymerase or a Phi29-type polymerase, for example a polymerase derived from the bacteriophage B103. The Phi29 and B103 DNA polymerases are B family polymerases from related bacteriophages. In addition to the A, B and C motifs, the Phi29 family of DNA polymerases contain an additional conserved motif, KXY in region Y (Blanco et al., J. Biol. Chem. 268: 16763-16770 (1993)). Mutations to Phi29 and B103 polymerases that affect polymerase activity and nucleotide binding affinity are described in U.S. Patent Publication No. 20110014612 and its priority documents U.S. Provisional Application Nos. 61/307,356; 61/299,917; 61/299,919; 61/293,616; 61/293,618; 61/289,388; 61/263,974; 61/245,457; 61/242,771; 61/184,770; and 61/164,324, herein incorporated by reference in their entireties.

In some embodiments, the polymerase is derived from the reverse transcriptase from human immunodeficiency virus type 1 (HIV-1), which is a heterodimer consisting of one 66-kDa and one 51-kDa subunit. The p66 subunit contains both a polymerase and an RNase H domain; proteolytic cleavage of p66 removes the RNase H domain to yield the p51 subunit (Wang et al., PNAS 91:7242-7246 (1994)). The structure of the HIV-1 reverse transcriptase shows multiple interactions between the 2′-OH groups of the RNA template and the reverse transcriptase. Residues Ser280 and Arg284 of helix I in the p66 thumb are involved in the RNA-RT interactions, as well as residues Glu89 and Gln91 of the template grip in the p66 palm. The p51 subunit also plays a role in the interactions between the RNA-DNA duplex and the RT, with residues Lys395, Glu396, Lys22 and Lys390 of the p51 subunit also interacting with the DNA:RNA duplex (Kohlstaedt et al., Science 256: 1783-1790 (1992) and Sarafianos et al., The EMBO Journal 20:1449-1461 (2001)).

In some embodiments, the polymerase is derived from the Bst DNA polymerase of Bacillus stearothermophilus, or any biologically active fragment thereof. The Bst polymerase can be a family A DNA polymerase. The large fragment of the naturally occurring Bst DNA polymerase is equivalent to the Klenow fragment of E. coli Pol I, retaining the polymerase and proofreading exonuclease activities while lacking the 5′ to 3′ exonuclease activity. In some embodiments, the polymerase derived from Bst DNA polymerase can lack 3′ to 5′ exonuclease activity. As used herein, the term “Bst DNA polymerase” may refer to a full length Bst protein or to a Bst large fragment.

In some embodiments, the modified polymerase consists of or comprises an isolated variant of a polymerase having or comprising an amino acid sequence that is at least 80% identical to the amino acid sequence of a wild type full length or wild type large fragment Bst DNA polymerase. In some embodiments, the modified polymerase is an isolated variant of a Bst DNA polymerase comprising a variant having an amino acid sequence that is at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the amino acid sequence of wild type Bst or large fragment Bst DNA polymerase. In some embodiments, the modified Bst polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions or chemical modifications) relative to a Bst polymerase corresponding to a reference polymerase (e.g., wild-type Bst DNA polymerase).

In some embodiments, the modified polymerase consists of or comprises an isolated variant of a Bst DNA polymerase having or comprising the amino acid sequence of wild type full length Bst DNA polymerase further comprising one or more of the following amino acid substitutions: His46Arg (H46R), Glu446Gln (E446Q), and His572Arg (H572R), wherein the numbering is relative to the wild type amino acid sequence of Bst DNA polymerase.
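For illustration, substitutions written in the one-letter form used above (e.g., H46R) can be applied to a reference sequence while checking that the stated wild type residue is actually present at the stated position. This is a sketch only: the function name and the toy sequence are hypothetical, the toy sequence is not the Bst DNA polymerase sequence, and numbering is assumed to be 1-based relative to the supplied reference.

```python
import re

# One-letter substitution notation: wild type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(reference: str, substitutions: list) -> str:
    """Return `reference` with each substitution (e.g. "H46R") applied.

    Raises ValueError if the notation is malformed, the position is out of
    range, or the reference residue does not match the stated wild type residue.
    """
    residues = list(reference)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"{sub}: expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical five-residue toy reference, used only to exercise the function.
toy_reference = "MKHLE"
print(apply_substitutions(toy_reference, ["H3R", "E5Q"]))
```

Checking the wild type residue before substituting guards against numbering that is offset from the reference, for example when the methionine at position 1 has been removed.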

In some embodiments, the modified polymerase consists of or comprises an isolated variant of a polymerase having or comprising an amino acid sequence that is at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the amino acid sequence of wild type full length Bst DNA polymerase further comprising one or more of each of the following amino acid substitutions: His46Arg (H46R), Glu446Gln (E446Q), and His572Arg (H572R), wherein the numbering is relative to the wild type full length amino acid sequence of Bst DNA polymerase. In some embodiments, the modified polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions or chemical modifications) relative to the reference polymerase (e.g., a polymerase lacking the one or more amino acid modifications).

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of an amino acid sequence having at least 80% identity to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of an amino acid sequence having at least 90% identity to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 1, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 1, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase consists of or comprises an isolated variant of a polymerase having or comprising an amino acid sequence that is at least 80% identical to the amino acid sequence of SEQ ID NO: 2. In some embodiments, the polymerase is a variant of a Taq DNA polymerase comprising an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2. In some embodiments, the reference polymerase is a Taq DNA polymerase consisting of the amino acid sequence of SEQ ID NO: 2 and the modified polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions or chemical modifications) relative to the reference polymerase. In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases include a deletion or substitution of the methionine residue at position 1, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 2.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, and wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, and wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 2, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34.
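As a rough illustration of how a criterion such as "at least 100 contiguous amino acid residues having at least 90% identity" might be evaluated computationally, the following minimal sketch slides a fixed-length window along two sequences that are assumed to be already aligned to equal length, with gaps written as "-". The function names, the treatment of gaps, and the choice to skip windows that contain a gap in the query are illustrative assumptions and are not taken from the disclosure; identity can be calculated in other ways.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions carrying the same residue (gaps never match)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def has_window(query: str, reference: str, window: int, threshold: float) -> bool:
    """True if some stretch of `window` contiguous query residues meets `threshold`.

    Assumption: both strings come from the same pairwise alignment, and a
    window containing a gap in the query is not a contiguous run of residues,
    so it is skipped.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to equal length")
    if window <= 0:
        raise ValueError("window must be positive")
    for start in range(len(query) - window + 1):
        q = query[start:start + window]
        if "-" in q:
            continue
        if percent_identity(q, reference[start:start + window]) >= threshold:
            return True
    return False
```

For a real polymerase sequence, the two aligned strings would typically be produced by a pairwise alignment tool before a check of this kind is applied.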

In some embodiments, the modified polymerase can include an amino acid sequence or any biologically active fragment thereof having or comprising the amino acid sequence of SEQ ID NO: 3. In some embodiments, the modified polymerase can include an amino acid sequence of any biologically active fragment of a polymerase having or comprising an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the amino acid sequence of SEQ ID NO: 3. In some embodiments, the reference polymerase is a Taq DNA polymerase consisting of the amino acid sequence of SEQ ID NO: 3 and the modified polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions or chemical modifications) relative to the reference polymerase. In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases include a deletion or substitution of the methionine residue at position 1, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 3.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 3, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase consists of or comprises an isolated variant of a polymerase having or comprising an amino acid sequence that is at least 80% identical to the amino acid sequence of SEQ ID NO: 4. In some embodiments, the polymerase is a variant of a Taq DNA polymerase comprising an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 4. In some embodiments, the reference polymerase is a Taq DNA polymerase consisting of the amino acid sequence of SEQ ID NO: 4 and the modified polymerase includes one or more amino acid modifications (e.g., amino acid substitutions, deletions, additions or chemical modifications) relative to the reference polymerase. In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases include a deletion or substitution of the methionine residue at position 1, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 4.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 1.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved thermostability as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 95% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues having at least 98% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 150 contiguous amino acid residues having at least 99% identity to SEQ ID NO: 4, wherein the modified polymerase or biologically active fragment thereof has improved accuracy as compared to SEQ ID NO: 34.

In some embodiments, the disclosure relates generally to a modified polymerase that includes an isolated variant of a Taq DNA polymerase comprising an amino acid sequence selected from the group consisting of: SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure relates generally to a modified polymerase that includes an isolated variant of a Taq DNA polymerase comprising an amino acid sequence selected from the group consisting of: SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, and further including one or more amino acid substitutions that are not naturally occurring. Optionally, the modified polymerase includes one, two, three, four, five, or more amino acid substitutions relative to the amino acid sequence of SEQ ID NO: 1 or 34.

In some embodiments, the reference polymerase can include a Taq DNA polymerase having or comprising the amino acid sequence of SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, wherein the modified polymerase comprises a variant of the reference polymerase and thereby further includes one, two, three, four, five, or more amino acid substitutions relative to the reference polymerase. In some embodiments, the modified polymerase comprises or consists of an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to the amino acid sequence of the reference polymerase but is typically less than 100% identical with respect to amino acid sequence. In some embodiments, the one, two, three, four, five, or more amino acid substitutions relative to the reference polymerase can include at least one conservative amino acid substitution.
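By way of a hedged illustration only, the relationship between a reference polymerase and a variant carrying a small number of substitutions can be sketched as a position-by-position comparison. The sketch below assumes two sequences of equal length with no insertions or deletions, uses 1-based numbering, and reports each difference in the original-residue, position, new-residue form used in this disclosure (for example, P6N). The function names and the toy sequences in the usage note are hypothetical and are not sequences of the Sequence Listing.

```python
def list_substitutions(reference: str, variant: str) -> list[str]:
    """Return substitution codes (e.g. 'P6N') for every differing position.

    Assumption: the sequences have equal length and differ only by
    substitutions; positions are numbered from 1 relative to `reference`.
    """
    if len(reference) != len(variant):
        raise ValueError("this sketch only handles equal-length sequences")
    return [
        f"{ref_res}{pos}{var_res}"
        for pos, (ref_res, var_res) in enumerate(zip(reference, variant), start=1)
        if ref_res != var_res
    ]


def identity_to_reference(reference: str, variant: str) -> float:
    """Percent identity implied by the substitution count."""
    if not reference:
        raise ValueError("reference must be non-empty")
    n_subs = len(list_substitutions(reference, variant))
    return 100.0 * (len(reference) - n_subs) / len(reference)
```

With the hypothetical ten-residue sequences "MKTAYPGGLV" (reference) and "MKTAYNGGLA" (variant), the first function returns the codes P6N and V10A, and the second returns 80.0, that is, at least 80% but less than 100% identity.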

In some embodiments, the modified polymerase or the biologically active fragment thereof having improved thermostability and/or improved accuracy relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34) comprises or consists of an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof further comprises at least 25 contiguous amino acids of the polymerase DNA binding domain. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises at least 50 contiguous amino acid residues of the polymerase DNA binding domain. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues of the polymerase DNA binding domain. In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 100 contiguous amino acid residues of the polymerase DNA binding domain, while also having at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the modified polymerase or the biologically active fragment thereof comprises or consists of at least 200 contiguous amino acid residues of the polymerase DNA binding domain, while also having at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a composition comprising an isolated polypeptide having at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a composition comprising an isolated nucleic acid having at least 80% identity to SEQ ID NO: 1 and further comprising at least one amino acid substitution at a position selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 and L828, wherein the numbering is relative to the numbering of amino acid residues of SEQ ID NO: 1.

In some embodiments, the disclosure is generally related to a composition comprising an isolated nucleic acid having at least 80% identity to SEQ ID NO: 1 and further comprising at least one amino acid substitution selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to the numbering of amino acid residues of SEQ ID NO: 1.
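The substitution codes recited above follow the common convention of original residue, 1-based position, and replacement residue (for example, P6N denotes proline at position 6 replaced by asparagine). As a minimal sketch, assuming single-letter amino acid codes and numbering relative to whatever sequence string is supplied, such codes could be parsed and applied as follows. The names `Substitution`, `parse_substitution` and `apply_substitution` are hypothetical helpers, and the toy sequence in the usage note is not SEQ ID NO: 1.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    original: str     # residue expected in the reference sequence
    position: int     # 1-based position in the reference numbering
    replacement: str  # residue present in the modified sequence


_CODE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(code: str) -> Substitution:
    """Parse a code such as 'P6N' into its three parts."""
    match = _CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of `sequence` carrying the substitution.

    Raises if the position is out of range or the reference residue at
    that position does not match the code, which guards against applying
    a code to a sequence with different numbering.
    """
    index = sub.position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {sub.position} is outside the sequence")
    if sequence[index] != sub.original:
        raise ValueError(
            f"expected {sub.original} at position {sub.position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + sub.replacement + sequence[index + 1:]
```

For instance, applying `parse_substitution("P6N")` to the hypothetical sequence "MKTAYPGG" yields "MKTAYNGG". Alternatives recited at one position, such as E295F or E295N, would be parsed as two separate codes, only one of which can be applied to a given sequence.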

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 1 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 3 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 4 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 5 and having one or more amino acid mutations selected from the group consisting of A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 6 and having one or more amino acid mutations selected from the group consisting of P6N, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 7 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 8 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 9 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 10 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 11 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 12 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 13 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 14 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 15 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 16 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolatedand purified polypeptide comprising or consisting of at least 80%, atleast 85%, at least 90%, at least 95%, at least 97%, at least 98%, or atleast 99% identity to SEQ ID NO: 17 and having one or more amino acidmutations selected from the group consisting of P6N, A77E, A97V, L193V,K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, L490Q,A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T,L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolatedand purified polypeptide comprising or consisting of at least 80%, atleast 85%, at least 90%, at least 95%, at least 97%, at least 98%, or atleast 99% identity to SEQ ID NO: 18 and having one or more amino acidmutations selected from the group consisting of P6N, A77E, A97V, L193V,K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C,A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T,L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 19 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 20 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 21 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 22 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 23 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 24 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 25 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 26 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 27 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 28 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 29 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 30 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 31 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E805I and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 32 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C and L828A.

In some embodiments, the disclosure is generally related to an isolated and purified polypeptide comprising or consisting of at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 33 and having one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C and E805I.

In some embodiments, the composition comprises at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 and further comprises at least one amino acid substitution selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 and L828, wherein the numbering is specific to the numbering of amino acid residues of SEQ ID NO: 1. In some embodiments, the amino acid substitution comprises a conservative amino acid substitution.

In some embodiments, the composition comprises at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 and further comprises at least one amino acid substitution selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is specific to the numbering of amino acid residues of SEQ ID NO: 1.

In some embodiments, the composition comprises at least 85%, at least 90%, at least 95%, at least 98% or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, and further comprises at least one amino acid substitution selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is specific to the numbering of amino acid residues of SEQ ID NO: 1.
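The identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length by some external tool (this passage does not name an alignment method), and it uses one common convention for counting identity (matches divided by aligned columns, gaps included); other conventions exist. The function names and the toy sequences are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Columns where both sequences carry a gap hold no information; skip them.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True when the pair reaches the given percent identity (e.g., 80, 90, 95)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue sequences (hypothetical), differing at one position.
toy_reference = "MKTAYPAKQR"
toy_variant = "MKTAYNAKQR"
identity = percent_identity(toy_reference, toy_variant)
```

With these toy inputs, one mismatch in ten columns gives 90% identity, which would satisfy the "at least 80%", "at least 85%" and "at least 90%" thresholds but not "at least 95%".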

In some embodiments, the modified polymerase can include any one or more amino acid substitutions selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified polymerase has improved accuracy and/or improved thermostability relative to the reference polymerase. Without being bound to any particular theory of operation, it can be observed that in some embodiments one or more of the aforementioned substitutions can alter (e.g., increase or decrease) the accuracy or thermostability of the modified polymerase relative to a reference (e.g., unmodified) polymerase. In some embodiments, such increase in accuracy and/or thermostability can be observed as an increase in signal produced in an ion-based sequencing reaction.
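The substitution labels used throughout (for example, E397V: original residue E, position 397 in the reference numbering, replacement residue V) can be read mechanically. The following is a minimal sketch of parsing such labels and applying them to a reference sequence with 1-based numbering. The helper names and the toy reference are hypothetical; the toy sequence merely has a proline at position 6 so that a P6N-style label can be demonstrated, and it is not SEQ ID NO: 1.

```python
import re

# One-letter original residue, 1-based position, one-letter replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> tuple[str, int, str]:
    """Split a label such as 'E397V' into ('E', 397, 'V')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def apply_substitutions(reference: str, labels: list[str]) -> str:
    """Return a copy of `reference` carrying each substitution.

    Positions are 1-based and refer to the reference numbering. Each label is
    checked against the reference residue, so a mislabeled position is rejected
    instead of silently producing a different variant.
    """
    residues = list(reference)
    for label in labels:
        original, position, replacement = parse_substitution(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position outside the reference")
        if reference[position - 1] != original:
            raise ValueError(
                f"{label}: reference has {reference[position - 1]} at {position}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


toy_reference = "MKTAYPAKQR"  # hypothetical; P at position 6
toy_variant = apply_substitutions(toy_reference, ["P6N"])
```

Validating the original residue before substituting is a design choice of this sketch; it guards against applying a label to a sequence whose numbering differs from the reference.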

In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases can further include a deletion of the methionine residue at position 1, or a substitution of the methionine residue at position 1 with any other amino acid residue, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 34.
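Because positions are numbered relative to the reference sequence, deleting the methionine at position 1 shifts every string index by one while the reference numbering stays the same. A minimal sketch of that bookkeeping follows; the function names and the toy sequence are hypothetical and are used only for illustration.

```python
def delete_initiator_met(sequence: str) -> str:
    """Remove the methionine at position 1 of the sequence."""
    if not sequence.startswith("M"):
        raise ValueError("sequence does not start with methionine")
    return sequence[1:]


def residue_at_reference_position(met_deleted: str, reference_position: int) -> str:
    """Look up a residue in a Met1-deleted sequence using reference numbering.

    Reference position 2 is the first residue of the Met1-deleted sequence,
    so the 0-based string index is reference_position - 2.
    """
    index = reference_position - 2
    if index < 0 or index >= len(met_deleted):
        raise ValueError("position not present in the Met1-deleted sequence")
    return met_deleted[index]


toy_reference = "MKTAYPAKQR"  # hypothetical; P at reference position 6
toy_met_deleted = delete_initiator_met(toy_reference)
residue = residue_at_reference_position(toy_met_deleted, 6)
```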

In some embodiments, the disclosure is generally related to an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a composition comprising an isolated nucleic acid sequence comprising or consisting of a nucleic acid sequence encoding a polypeptide having at least 80%, at least 85%, at least 90%, at least 95%, at least 97%, at least 98%, or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, and further comprising one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A.

In some embodiments, the disclosure is generally related to a vector comprising an isolated nucleic acid sequence encoding a polypeptide or a biologically active fragment thereof selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33. In some embodiments, the vector comprising the isolated nucleic acid sequence encoding a polypeptide or biologically active fragment thereof includes a DNA polymerase. In some embodiments, the DNA polymerase is a Thermus aquaticus (Taq) polymerase. In some embodiments, the DNA polymerase is a thermostable DNA polymerase. In some embodiments, the DNA polymerase is derived from a thermostable Thermus aquaticus (Taq) polymerase.

In some embodiments, the disclosure is generally related to a vector comprising an isolated nucleic acid sequence encoding a polypeptide or a biologically active fragment thereof that comprises a homolog of Taq DNA polymerase, wherein the homolog of Taq DNA polymerase includes at least one amino acid substitution corresponding to the amino acid substitutions present in any one of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure is generally related to a kit comprising an isolated polypeptide having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the kit comprises an isolated polypeptide having at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the kit comprises an isolated polypeptide comprising or consisting of at least 250, at least 300, at least 350, at least 400, at least 450, at least 500, at least 550, at least 600, or at least 650 contiguous amino acid residues having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the kit further includes one or more suitable buffers, MgCl2 and dNTPs.

In some embodiments, the disclosure generally relates to a system (and related apparatus, kits, methods and compositions) for amplifying one or more nucleic acids. In some embodiments, the system can comprise a DNA polymerase having at least one mutation (e.g., substitution, insertion, deletion, fusion, and the like) as compared to the amino acid sequence of SEQ ID NO: 1 or SEQ ID NO: 34; a solid support comprising a nucleic acid molecule to be amplified; a mixture of nucleotides (e.g., dNTPs, ddNTPs, and the like); and conditions under which the nucleic acid molecule is amplified on the solid support. In some embodiments, the amplification can include clonal amplification or bridge-PCR amplification. In some embodiments, the amplification can include proximity ligation amplification, rolling circle amplification, PCR amplification, isothermal amplification, recombinase polymerase amplification, strand displacement amplification, emulsion PCR amplification, and the like. In illustrative embodiments, the DNA polymerase is a modified polymerase that includes any of the following mutations: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a polymerase or a biologically active fragment thereof having DNA polymerase activity and at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98% or at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, wherein the polymerase or the biologically active fragment having DNA polymerase activity includes at least one amino acid substitution as compared to SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the polymerase or biologically active fragment thereof includes at least two, three, four, five, or more amino acid substitutions as compared to SEQ ID NO: 1 or SEQ ID NO: 34.

In some embodiments, the at least one amino acid substitution as compared to SEQ ID NO: 1 or SEQ ID NO: 34 can impart a beneficial property to the polymerase or biologically active fragment thereof. In some embodiments, the beneficial property imparted to the polymerase or biologically active fragment thereof (as compared to SEQ ID NO: 1 or SEQ ID NO: 34) includes improved thermostability, improved read length, improved templating efficiency, improved performance in a high ionic strength solution or improved accuracy. In some embodiments, the beneficial property imparted to the polymerase or biologically active fragment thereof (as compared to SEQ ID NO: 1 or SEQ ID NO: 34) includes reduced strand bias of GC and AT rich nucleic acids. It will be generally understood by those of ordinary skill in the art that the beneficial property imparted to the polymerase or biologically active fragment (as compared to the properties of SEQ ID NO: 1 or SEQ ID NO: 34) can be determined by assessing and/or measuring such beneficial properties under identical conditions by any appropriate means (e.g., comparing the properties of SEQ ID NO: 1 against the polymerase or biologically active fragment thereof under identical conditions). For example, the accuracy of a DNA polymerase can be measured in terms of the longest perfect read (typically measured in terms of the number of nucleotides correctly included in the read) obtained from a nucleotide polymerization reaction. In some embodiments, the nucleotide polymerization reaction can be conducted using emulsion PCR, bridge PCR or hot-start PCR conditions. In some embodiments, one or more of the beneficial properties imparted to the polymerase or biologically active fragment thereof can be determined by assessing sequencing accuracy. In some embodiments, sequencing accuracy can be determined using any next-generation (i.e., massively parallel, high throughput) sequencing platform (e.g., Ion Torrent Systems, Illumina HiSeq or True Seq or X-10 systems). In some embodiments, sequencing accuracy can be determined using any ISFET based sequencing system. However, it will be apparent that other appropriate methods to determine improved thermostability and/or improved accuracy may be used and are contemplated within the scope of the present disclosure.
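The "longest perfect read" measure mentioned above can be sketched as the longest run of consecutive bases in a read that agree with the template. This is one reading of the phrase, offered as an assumption: the sketch takes a read and a template that are already aligned position by position without gaps, and the function name and toy sequences are hypothetical.

```python
def longest_perfect_run(read: str, template: str) -> int:
    """Length of the longest stretch of consecutive matching bases.

    Assumes `read` and `template` are aligned position by position with no
    gaps; comparison stops at the end of the shorter sequence.
    """
    best = 0
    current = 0
    for read_base, template_base in zip(read, template):
        if read_base == template_base:
            current += 1
            best = max(best, current)
        else:
            current = 0  # a miscalled base ends the current perfect stretch
    return best


# Toy example (hypothetical): a single mismatch at the fifth base.
toy_template = "ACGTTCGT"
toy_read = "ACGTACGT"
run_length = longest_perfect_run(toy_read, toy_template)
```

Comparing this value for a reference and a modified polymerase is only meaningful when both reads come from reactions run under identical conditions, as the passage above notes.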

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of a biologically active fragment of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 that retains polymerase activity. In some embodiments, the polymerase activity, characteristic, or property is selected from primer extension activity, strand displacement activity, proofreading activity, nick-initiated polymerase activity, reverse transcriptase activity, accuracy, average read length, thermostability, processivity, strand bias or nucleotide polymerization activity. In some embodiments, the polymerase activity, characteristic, or property is selected from one or more sequencing based metrics selected from raw read accuracy, average read length, thermostability or processivity.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of a biologically active fragment of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 having polymerase activity selected from improved read length, improved accuracy or improved thermostability as compared to polymerase activity of SEQ ID NO: 1 or SEQ ID NO: 34 under identical conditions. In some embodiments, the polymerase activity is determined in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution is at least 120 mM KCl. In some embodiments, the high ionic strength solution is from 125 mM KCl to 200 mM KCl.
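For orientation, the ionic strength contributed by a fully dissociated salt follows the standard relation I = 1/2 Σ c_i z_i², so a 1:1 salt such as KCl contributes an ionic strength equal to its concentration. The following minimal sketch computes that contribution and checks a KCl concentration against the 125 mM to 200 mM range recited above. It deliberately ignores buffer components, magnesium and nucleotides, which would add to the total in a real reaction; the function names are hypothetical.

```python
def ionic_strength_mM(ions: list[tuple[float, int]]) -> float:
    """Ionic strength from (concentration in mM, charge) pairs: 0.5 * sum(c * z^2)."""
    return 0.5 * sum(conc * charge * charge for conc, charge in ions)


def kcl_ionic_strength_mM(kcl_mM: float) -> float:
    """Contribution of fully dissociated KCl: K+ and Cl- each at the salt concentration."""
    return ionic_strength_mM([(kcl_mM, +1), (kcl_mM, -1)])


def in_recited_kcl_range(kcl_mM: float, low: float = 125.0, high: float = 200.0) -> bool:
    """True when the KCl concentration lies within the recited range (inclusive)."""
    return low <= kcl_mM <= high


contribution = kcl_ionic_strength_mM(125.0)
```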

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E397 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E397V amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a L763 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a L763F amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E805 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E805I amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E745 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising a E745T amino acid substitution, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E397 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E397V amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a L763 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a L763F amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E805 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E805I amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E745 amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising a E745T amino acid substitution, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising E397, E745 and L763 amino acid substitutions, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising E397V, E745T and L763F amino acid substitutions, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising E397, E745 and L763 amino acid substitutions, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising E397V, E745T and L763F amino acid substitutions, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising E805 and L763 amino acid substitutions, wherein the numbering is relative to SEQ ID NO: 1. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 1 and further comprising E805I and L763F amino acid substitutions, wherein the numbering is relative to SEQ ID NO: 1.

In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising E805 and L763 amino acid substitutions, wherein the numbering is relative to SEQ ID NO: 34. In some embodiments, the disclosure generally relates to a substantially purified polymerase having an amino acid sequence comprising or consisting of at least 90% identity to SEQ ID NO: 34 and further comprising E805I and L763F amino acid substitutions, wherein the numbering is relative to SEQ ID NO: 34.

In some embodiments, the reference polymerase has or comprises the amino acid sequence of SEQ ID NO: 2, SEQ ID NO: 3 or SEQ ID NO: 4, and the modified polymerase has or comprises the amino acid sequence of the reference polymerase, further including one or more amino acid mutations as compared to the reference polymerase. In some embodiments, the amino acid mutations include a substitution of the existing amino acid residue at the indicated position with any other amino acid residue (including both naturally occurring and non-natural amino acid residues). In some embodiments, the amino acid substitution is a conservative substitution; alternatively, the amino acid substitution can be a non-conservative substitution. In some embodiments, the reference polymerase, the modified polymerase, or both the reference and modified polymerases can further include a deletion of the methionine residue at position 1, or a substitution of the methionine residue at position 1 with any other amino acid residue, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified polymerase exhibits a change in any one or more parameters selected from the group consisting of: average read length, accuracy, total sequencing throughput, strand bias, lowered systematic error, enhanced polymerase performance in high ionic strength solution, improved processivity, improved performance in PCR, and performance in emulsion PCR, relative to the reference polymerase. Optionally, the change in any one or more parameters is observed by comparing the performance of the reference and modified polymerases in an ion-based sequencing reaction.

Without being bound to any particular theory of operation, it can be observed that in some embodiments a modified polymerase including one or more of the disclosed amino acid substitutions exhibits an altered (e.g., increased) processivity relative to an unmodified polymerase, or an altered (e.g., decreased) strand bias relative to the unmodified polymerase. In some embodiments, the modified polymerase exhibits an altered (e.g., increased) accuracy relative to the unmodified polymerase. In some embodiments, the modified polymerase exhibits an altered (e.g., increased) average error-free read length, or altered (e.g., increased) observed 100Q17 or 200Q17 values relative to the reference polymerase. In some embodiments, the modified polymerase has polymerase activity. In some embodiments, the modified polymerase or biologically active fragment can have primer extension activity in vivo or in vitro.

In some embodiments, the one or more mutations in the modified polymerase can include at least one amino acid substitution. The at least one amino acid substitution can optionally occur at any one or more positions selected from the group consisting of P6, A77, A97, L193, K240, R266, E267, L287, P291, K292, E295, E397, G418, L490, A502, S543, D578, R593, L678, S699, E713, V737, E745, L763, E790, E794, E805 and L828, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified polymerase includes at least two, three, four, five, or more amino acid substitutions occurring at positions selected from this group. In some embodiments, the at least one amino acid substitution can optionally include any one or more substitutions selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified polymerase includes at least two, three, four, five, or more amino acid substitutions selected from this group.
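The substitution labels used above follow the common convention of wild-type residue, 1-based position, then replacement residue (for example, E397V denotes glutamic acid at position 397 replaced by valine). As a non-limiting sketch of how such labels may be parsed and applied to a reference sequence, the following assumes single-letter codes for the twenty standard residues; the helper names and the six-residue toy sequence are hypothetical and do not reproduce SEQ ID NO: 1.

```python
import re
from typing import List, NamedTuple


class Substitution(NamedTuple):
    wild_type: str   # residue expected in the reference
    position: int    # 1-based position relative to the reference sequence
    mutant: str      # replacement residue


_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'E397V' into its three parts."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitutions(reference: str, labels: List[str]) -> str:
    """Return a copy of the reference carrying every listed substitution.

    The wild-type residue in each label is checked against the reference so
    that a numbering mismatch is reported instead of silently applied.
    """
    residues = list(reference)
    for label in labels:
        sub = parse_substitution(label)
        index = sub.position - 1
        if not 0 <= index < len(residues):
            raise ValueError(f"{label}: position outside the reference")
        if residues[index] != sub.wild_type:
            raise ValueError(
                f"{label}: reference has {residues[index]} at {sub.position}")
        residues[index] = sub.mutant
    return "".join(residues)


# Hypothetical toy reference in which position 6 is proline.
print(apply_substitutions("MKTAYP", ["P6N"]))
```

Checking the wild-type residue before substituting is a deliberate design choice here, since the numbering is stated to be relative to a particular reference sequence.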

Without being bound to any particular theory of operation, it can be observed that in some embodiments a modified polymerase including any one of such amino acid substitutions exhibits an altered (e.g., increased or decreased) thermostability relative to an unmodified polymerase, or altered (e.g., increased or decreased) accuracy relative to the corresponding unmodified polymerase, or relative to the reference polymerase. It will be apparent to one of ordinary skill in the art that some amino acid residues of the modified polymerase may be highly conserved amino acid residues. It is anticipated that one of ordinary skill in the art can construct, express, and determine which amino acid residues, if any, in a given polymerase are highly conserved by well-known means (for example, see U.S. Pat. Nos. 5,436,149; 6,395,524; 6,982,144; 7,312,059 and 8,420,325, all of which are incorporated herein in their entireties).

In some embodiments, the modified polymerase can include a Taq DNA polymerase. In some embodiments, the polymerase can include a Taq DNA polymerase commercially available as Platinum Taq High Fidelity DNA polymerase (Life Technologies, CA), that includes one or more amino acid mutations as compared to the reference polymerase. In some embodiments, the modified polymerase can include a Taq DNA polymerase having, or comprising, the amino acid sequence of SEQ ID NO: 1, which is the amino acid sequence of the wild-type Taq DNA polymerase.

In some embodiments, the modified polymerase includes a mutant or variant form of a Taq DNA polymerase that retains a detectable level of polymerase activity. In order to retain the polymerase activity of the Taq DNA polymerase, any substitutions, deletions or chemical modifications will be made to amino acid residues that are not highly conserved, thereby leaving intact highly conserved residues such as the invariant aspartic acid residues required for polymerase activity. In some embodiments, the modified polymerase can include Taq DNA polymerase, a hot-start Taq DNA polymerase, a chemical hot-start Taq DNA polymerase, a Platinum Taq DNA polymerase and the like.

In some embodiments, the modified polymerase can include an isolated variant of a polymerase having or comprising an amino acid sequence that is at least 90% identical to the amino acid sequence of SEQ ID NO: 2. In some embodiments, the polymerase is a variant of a Taq DNA polymerase comprising the amino acid sequence of SEQ ID NO: 4, wherein the variant comprises an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2.

In some embodiments, the modified polymerase includes a mutant or variant form of Taq DNA polymerase having amino acid mutation E397V, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase can include amino acid mutation L763F, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase can include amino acid mutation E805I, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase can include amino acid mutation E745T, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase can include amino acid mutation A97V, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase can include amino acid mutation E295F, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, the modified Taq DNA polymerase can include amino acid mutation P6N, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1. In some embodiments, a modified polymerase including one or more of the above mutations exhibits altered (e.g., increased or decreased) accuracy relative to the corresponding reference polymerase (e.g., unmodified polymerase SEQ ID NO: 1). In some embodiments, the modified Taq polymerase has an altered (e.g., increased or decreased) thermostability relative to the reference polymerase (e.g., unmodified polymerase SEQ ID NO: 1). In some embodiments, the modified Taq polymerase exhibits an altered (e.g., increased or decreased) read length, or an altered (e.g., increased or decreased) strand bias, or altered (e.g., increased or decreased) processivity, or altered systematic error (e.g., increased or decreased), or altered (e.g., increased or decreased) observed 100Q17 or 200Q17 values, or altered (e.g., increased or decreased) AQ17 or AQ20 values relative to the reference polymerase.
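For orientation, the "Q17" and "Q20" in the metrics named above are taken here to be Phred-scaled quality values, where Q = -10 log10(p) and p is the per-base error probability. That reading is an assumption for this illustration, since the metrics are not defined in this passage. Under it, Q17 corresponds to an error rate of about 2% and Q20 to an error rate of 1%. The function names below are hypothetical.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Per-base error probability implied by a Phred-scaled quality value."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Phred-scaled quality value for a per-base error probability."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


# Error rates implied by the two thresholds (assumed Phred convention).
for q in (17, 20):
    print(q, round(phred_to_error_probability(q), 4))
```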

In some embodiments, the modified Taq polymerase exhibits a change in any one or more of the following parameters: average read length, performance in high ionic strength solution, improved processivity, improved templating efficiency, improved thermostability, improved performance in emulsion PCR, reduced strand bias in GC or AT rich sequences, or reduced systematic error relative to the reference polymerase. In one embodiment, the change in one or more of the parameters is observed by comparing the performance of the reference polymerase and modified polymerase under identical conditions. Optionally, the change in one or more parameters can be observed in an ion-based sequencing reaction.

In some embodiments, the modified polymerase can include at least one amino acid substitution of an existing amino acid residue at the indicated position with any other amino acid residue (including both naturally occurring and non-natural amino acid residues). In some embodiments, the amino acid substitution is a conservative substitution; alternatively, the amino acid substitution can be a non-conservative substitution. In some embodiments, the reference polymerase, the modified Taq polymerase, or both the reference and modified Taq polymerases can further include a deletion of the methionine residue at position 1, or a substitution of the methionine residue at position 1 with any other amino acid residue, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1.

As the skilled artisan will readily appreciate, the scope of the present disclosure encompasses not only the specific amino acid and/or nucleotide sequences disclosed herein, but also, for example, many related sequences encoding genes and/or peptides with the functional properties described herein. For example, the scope and spirit of the disclosure encompasses any nucleotide and amino acid sequences encoding conservative variants of the various polymerases disclosed herein. It will also be readily apparent to the skilled artisan that the modified polymerases disclosed herein by amino acid sequence can be converted to the corresponding nucleotide sequence without undue experimentation, for example in silico, using any of a number of freely available sequence conversion applications.
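The in silico conversion of an amino acid sequence to an encoding nucleotide sequence mentioned above can be sketched as a simple back-translation. Because the genetic code is degenerate, many nucleotide sequences encode the same polypeptide; the sketch below assumes the standard genetic code and picks one valid codon per residue arbitrarily, without regard to codon usage in any expression host. The table, function name and toy peptide are illustrative only.

```python
# One valid standard-code codon per amino acid (arbitrary, illustrative choice).
CODON_FOR = {
    "A": "GCC", "R": "CGT", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTC", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}
STOP_CODON = "TAA"


def back_translate(protein: str, add_stop: bool = True) -> str:
    """Return one DNA coding sequence that encodes the given protein."""
    try:
        codons = [CODON_FOR[residue] for residue in protein.upper()]
    except KeyError as exc:
        raise ValueError(f"unsupported residue: {exc.args[0]!r}") from None
    if add_stop:
        codons.append(STOP_CODON)
    return "".join(codons)


# Toy peptide (hypothetical, not taken from the Sequence Listing).
print(back_translate("MEV", add_stop=False))
```

A production workflow would normally substitute a host-specific codon usage table for the arbitrary one-codon-per-residue table assumed here.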

It is anticipated that one of skill in the art, having identified one or more amino acid substitutions disclosed herein which impart a beneficial property to the modified polymerase (such as improved thermostability, improved accuracy, improved processivity, or improved read length as compared to a reference polymerase), can transfer such substitutions to a different polymerase species or polymerase family without undue experimentation. As a result, once an amino acid mutation is identified in a polymerase that provides altered catalytic or kinetic properties, the amino acid mutation can be screened using methods known to one of ordinary skill in the art (such as amino acid or nucleotide sequence alignment), to determine if the amino acid mutation can be readily transferred to a different polymerase, such as a different species, class or polymerase family. In some embodiments, the transferable (or homologous) amino acid mutation can include an amino acid mutation that enhances properties such as increased read-length, increased raw accuracy, decreased strand bias, reduced systematic error, increased total sequencing throughput, increased error-free read length, increased processivity, increased AQ values, and the like. In some embodiments, a transferable (or homologous) amino acid mutation can include transferring one or more amino acid mutations to another polymerase within, or between, DNA polymerase families, such as DNA polymerase family A or DNA polymerase family B. In some embodiments, a transferable (or homologous) amino acid mutation can include transferring one or more amino acid mutations to one or more polymerases within, or between, DNA polymerase families, such as across bacterial, viral, archaeal, eukaryotic or phage DNA polymerases.
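As one illustration of the alignment-based screening described above, a position numbered relative to a reference polymerase can be mapped onto a second polymerase by walking a pairwise alignment column by column. The sketch below assumes the alignment has already been computed and is supplied as two equal-length gapped strings; the function name and the short sequences are hypothetical.

```python
from typing import Optional, Tuple


def map_position(aligned_ref: str, aligned_target: str,
                 ref_position: int) -> Optional[Tuple[int, str]]:
    """Map a 1-based reference position to the aligned target residue.

    Returns (target_position, target_residue), or None when the reference
    residue is aligned to a gap in the target.
    """
    if len(aligned_ref) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0
    target_count = 0
    for ref_res, tgt_res in zip(aligned_ref, aligned_target):
        if ref_res != "-":
            ref_count += 1
        if tgt_res != "-":
            target_count += 1
        if ref_res != "-" and ref_count == ref_position:
            if tgt_res == "-":
                return None
            return target_count, tgt_res
    raise ValueError("ref_position lies outside the reference sequence")


# Toy alignment: the target has one insertion and one deletion.
ref = "MK-TAYPE"
tgt = "MKLTA-PD"
print(map_position(ref, tgt, 6))
```

Whether a mapped substitution actually confers the same property in the target polymerase remains an empirical question; the mapping only identifies the candidate position.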

In some embodiments, a modified polymerase according to the disclosure can include a polymerase having one or more amino acid mutations (such as a substitution, insertion, or deletion) that are homologous (a homolog) to one or more of the amino acid mutations disclosed herein. For example, the disclosure includes within its scope a modified polymerase having one or more amino acid mutations that are homologous to any one of the amino acid mutations provided in SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 and SEQ ID NO: 33. In some embodiments, a modified polymerase according to the disclosure can include any polymerase having one or more amino acid mutations that are homologous to one or more amino acid mutations provided herein for Taq DNA polymerase (e.g., one or more homologous amino acid mutations that correspond to one or more amino acid mutations of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33). One method for determining whether a polymerase is a homolog of one or more of the modified polymerases disclosed herein includes aligning the amino acid or nucleic acid sequence of the modified polymerase against that of the “test” polymerase. For example, the National Center for Biotechnology Information (NCBI) provides a variety of electronic databases (e.g., “HomoloGene” and “Protein Clusters”) that allow a user to determine whether an amino acid sequence is present as a homolog in another organism.

In some embodiments, a modified polymerase or a biologically active fragment of a polymerase according to the disclosure can include a polymerase having one or more amino acid mutations that are homologous to one or more amino acid mutations of Taq DNA polymerase, including any one or more amino acid mutations selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1.

In some embodiments, the modified polymerase or a biologically active fragment thereof includes two, three, four, five, or more amino acid mutations that are homologous to any two, three, four, five, or more amino acid substitutions selected from the group consisting of P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A, wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1.

In some embodiments, a modified polymerase or any biologically active fragment of a polymerase having one or more amino acid mutations that are homologous to an amino acid mutation disclosed herein for Taq DNA polymerase can optionally include at least one amino acid substitution designed to replace a non-cysteine amino acid residue with a cysteine residue. A person of ordinary skill will be readily able to determine nucleotide sequences that encode any of the amino acid sequences of the disclosure based on the known correspondence between the nucleotide sequence and a corresponding protein sequence.

In some embodiments, the modified polymerase or any biologically active fragment of a polymerase can include one or more biotin moieties. As used herein, the terms “biotin” and “biotin moiety” and their variants comprise biotin (cis-hexahydro-2-oxo-1H-thieno[3,4]imidazole-4-pentanoic acid) and any derivatives and analogs thereof, including biotin-like compounds. Such compounds include, for example, biotin-ε-N-lysine, biocytin hydrazide, amino or sulfhydryl derivatives of 2-iminobiotin and biotinyl-ε-aminocaproic acid-N-hydroxysuccinimide ester, sulfosuccinimideiminobiotin, biotinbromoacetylhydrazide, p-diazobenzoyl biocytin, 3-(N-maleimidopropionyl)biocytin, and the like, as well as any biotin variants that can specifically bind to an avidin moiety. The terms “avidin” and “avidin moiety” and their variants, as used herein, comprise the native egg-white glycoprotein avidin, as well as any derivatives, analogs and other non-native forms of avidin, which can specifically bind to biotin moieties. In some embodiments, the avidin moiety can comprise deglycosylated forms of avidin, bacterial streptavidins produced by selected strains of Streptomyces (e.g., Streptomyces avidinii), truncated streptavidins, and recombinant avidin and streptavidin, as well as derivatives of native, deglycosylated and recombinant avidin and of native, recombinant and truncated streptavidin, for example, N-acyl avidins (e.g., N-acetyl, N-phthalyl and N-succinyl avidin), and the commercial products ExtrAvidin®, Captavidin®, Neutravidin® and Neutralite Avidin®. All forms of avidin-type molecules, including both native and recombinant avidin and streptavidin as well as derivatized molecules, e.g., nonglycosylated avidins, N-acyl avidins and truncated streptavidins, are encompassed within the terms “avidin” and “avidin moiety”. Typically, but not necessarily, avidin exists as a tetrameric protein, wherein each of the four subunits is capable of binding at least one biotin moiety. As used herein, the term “biotin-avidin bond” and its variants refers to a specific linkage formed between a biotin moiety and an avidin moiety. Typically, a biotin moiety can bind with high affinity to an avidin moiety, with a dissociation constant K_d typically in the order of 10⁻⁴ to 10⁻¹⁵ mol/L. Typically, such binding occurs via non-covalent interactions.

In some embodiments, the modified polymerase or any biologically active fragment of a polymerase can include one or more modified or substituted amino acids relative to the unmodified or reference polymerase, and can further include a biotin moiety that is linked to at least one of the one or more modified or substituted amino acids. The biotin moiety can be linked to the modified polymerase using any suitable linking method. In some embodiments, the modified polymerase includes one or more cysteine replacement substitutions, and the linking moiety includes a biotin moiety that is linked to at least one of the one or more cysteine replacement substitutions. In some embodiments, the modified polymerase can be chemically modified to be reversibly inactivated such that it is activated with heat (see, e.g., U.S. Pat. No. 5,677,152, Birch et al.). In these embodiments, the modified polymerase is well suited for a hot start amplification method, such as a hot start PCR method.

In some embodiments, the modified polymerase is a biotinylated polymerase. The term “biotinylated” and its variants, as used herein, refer to any covalent or non-covalent adduct of biotin with other moieties such as biomolecules, e.g., nucleic acids (including DNA, RNA, DNA/RNA chimeric molecules, nucleic acid analogs and peptide nucleic acids), proteins (including enzymes, peptides and antibodies), carbohydrates, lipids, etc.

In some embodiments, the disclosure also relates generally to compositions (as well as related methods, kits, systems and apparatuses) comprising a modified polymerase including at least one amino acid modification relative to a reference polymerase, wherein the modified polymerase has improved processivity, improved thermostability and/or improved accuracy relative to a reference polymerase.

In some embodiments, the disclosure relates generally to a method for incorporating at least one nucleotide into a primer, comprising: contacting a nucleic acid complex including a template nucleic acid with a primer and a modified polymerase in the presence of one or more nucleotides, and incorporating at least one of the one or more nucleotides into the primer in a template-dependent fashion using said modified polymerase.

Methods for nucleotide incorporation are well known in the art and typically comprise use of a polymerase reaction mixture in which the polymerase is contacted with the template nucleic acid under nucleotide incorporation conditions. When the nucleotide incorporation reaction comprises polymerization of nucleotides onto the end of a primer, the process is typically referred to as “primer extension.” Typically, but not necessarily, such nucleotide incorporation occurs in a template-dependent fashion. Primer extension and other nucleotide incorporation assays are typically performed by contacting the template nucleic acid with a polymerase in the presence of nucleotides in an aqueous solution under nucleotide incorporation conditions. In some embodiments, the nucleotide incorporation reaction can include a primer, which can optionally hybridize to the template to form a primer-template duplex. Typical nucleotide incorporation conditions are achieved once the template, polymerase, nucleotides and optionally primer are mixed with each other in a suitable aqueous formulation, thereby forming a nucleotide incorporation reaction mixture (or primer extension mixture). The aqueous formulation can optionally include divalent cations and/or salts, particularly Mg²⁺ and/or Ca²⁺ ions. The aqueous formulation can optionally include divalent anions and/or salts, particularly SO₄²⁻. Typical nucleotide incorporation conditions have included well-known parameters for time, temperature, pH, reagents, buffers, salts, co-factors, nucleotides, target DNA, primer DNA, enzymes such as nucleic acid-dependent polymerase, amounts and/or ratios of the components in the reactions, and the like. The reagents or buffers can include a source of monovalent ions, such as KCl, K-acetate, NH₄-acetate, K-glutamate, NH₄Cl, or ammonium sulfate. The reagents or buffers can include a source of divalent ions (e.g., Mg²⁺ and/or Mn²⁺), such as MgCl₂ or Mg-acetate. In some embodiments, the reagents or buffers can include a source of detergent such as Triton and/or Tween. Most polymerases exhibit some level of nucleotide incorporation activity over a pH range of about 5.0 to about 9.5, more typically between about pH 7 and about pH 9, sometimes between about pH 6 and about pH 8, and sometimes between pH 7 and 8. In some embodiments, a nucleotide polymerization buffer can include chelating agents such as EDTA and/or EGTA, and the like. Although in some embodiments nucleotide incorporation reactions may include buffering agents, such as Tris, Tricine, HEPES, MOPS, ACES, or MES, which can provide a pH range of about 5.0 to about 9.5, such buffering agents can optionally be reduced or eliminated when performing ion-based reactions requiring detection of ion byproducts. In some embodiments, the nucleotide incorporation reaction may include trehalose. Methods of performing nucleic acid synthesis are well known and extensively practiced in the art, and references teaching a wide range of nucleic acid synthesis techniques are readily available. Some exemplary teachings regarding the performance of nucleic acid synthesis (including, for example, template-dependent nucleotide incorporation, as well as primer extension methods) can be found, for example, in Kim et al., Nature 376: 612-616 (1995); Ichida et al., Nucleic Acids Res. 33: 5214-5222 (2005); Pandey et al., European Journal of Biochemistry, 214: 59-65 (1993); Blanco et al., J. Biol. Chem. 268: 16763-16770 (1993); U.S. patent application Ser. No. 12/002,781, now published as U.S. Patent Publication No. 2009/0026082; U.S. patent application Ser. No. 12/474,897, now published as U.S. Patent Publication No. 2010/0137143; U.S. patent application Ser. No. 12/492,844, now published as U.S. Patent Publication No. 2010/0282617; and U.S. patent application Ser. No. 12/748,359, now published as U.S. Patent Publication No. 2011/0014612. Given the ample teaching of primer extension and other nucleotide incorporation reactions in the art, suitable reaction conditions for using the modified polymerases of the disclosure to perform nucleotide incorporation will be readily apparent to the skilled artisan. In some embodiments, the methods (and related kits, apparatus, systems and compositions) can include incorporation of one or more nucleotide analogs and/or reversible terminators.

In some embodiments, the disclosure relates generally to reagents (e.g., buffer compositions) and kits useful for performance of nucleotide polymerization reactions using polymerases, including any of the exemplary modified polymerases described herein. The nucleotide polymerization reactions can include without limitation nucleotide incorporation reactions (including both template-dependent and template-independent nucleotide incorporation reactions) as well as primer extension reactions. In some embodiments, the buffer composition can include any one or more of the following: a monovalent metal salt, a divalent metal salt, a divalent anion, and a detergent. For example, the buffer composition can include a potassium or sodium salt. In some embodiments, the buffer composition can include a manganese or magnesium salt. In some embodiments, the buffer composition can include a sulfate such as potassium sulfate and/or magnesium sulfate. In some embodiments, the buffer composition can include a detergent. In some embodiments, the buffer compositions can include a detergent selected from the group consisting of Triton and Tween. In some embodiments, the buffer can include a reagent for a hot start amplification step, such as an oligonucleotide or aptamer.

In some embodiments, the buffer composition can include at least one potassium salt, at least one manganese salt, and Triton X-100 (Pierce Biochemicals). The salt can optionally include a chloride salt or a sulfate salt. In some embodiments, the buffer composition can have a pH of about 7.3 to about 8.0. In some embodiments, the buffer composition can have a pH of about 7.4 to about 7.9. In some embodiments, the buffer composition includes a potassium salt at a concentration of 5-250 mM, 50-225 mM, or 125-200 mM (depending on the divalent cation).

In some embodiments, the buffer composition includes magnesium or manganese salt at a concentration of between 1 mM and 20 mM. In some embodiments, the buffer composition includes magnesium or manganese salt at a concentration of between 6 mM and 15 mM.

In some embodiments, the buffer composition includes a sulfate at a concentration of between 1 mM and 100 mM. In some embodiments, the buffer composition includes a sulfate at a concentration of between 5 mM and 50 mM.

In some embodiments, the buffer composition includes a detergent (e.g., Triton X-100 or Tween-20) at a concentration of between 0.001% and 1%. In some embodiments, the buffer composition includes a detergent (e.g., Triton X-100 or Tween-20) at a concentration of between 0.0025% and 0.0125%.
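Purely as an illustration of the broader ranges recited above, a candidate buffer recipe can be screened against them programmatically. The sketch below collects the widest stated range for each component; the dictionary keys and function name are hypothetical, and the pH endpoints, which are recited as "about" values, are treated as hard limits only for simplicity.

```python
from typing import Dict, List, Tuple

# Widest ranges recited in the preceding paragraphs (inclusive bounds).
BROAD_RANGES: Dict[str, Tuple[float, float]] = {
    "pH": (7.3, 8.0),
    "potassium_salt_mM": (5.0, 250.0),
    "divalent_salt_mM": (1.0, 20.0),      # magnesium or manganese salt
    "sulfate_mM": (1.0, 100.0),
    "detergent_percent": (0.001, 1.0),    # e.g., Triton X-100 or Tween-20
}


def out_of_range(recipe: Dict[str, float]) -> List[str]:
    """Return the names of recipe components that fall outside the ranges."""
    problems = []
    for name, value in recipe.items():
        if name not in BROAD_RANGES:
            raise KeyError(f"no range recorded for component {name!r}")
        low, high = BROAD_RANGES[name]
        if not low <= value <= high:
            problems.append(name)
    return problems


# Hypothetical recipe used only to exercise the check.
recipe = {
    "pH": 7.5,
    "potassium_salt_mM": 150.0,
    "divalent_salt_mM": 10.0,
    "sulfate_mM": 20.0,
    "detergent_percent": 0.01,
}
print(out_of_range(recipe))
```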

In some embodiments, the disclosed modified polymerase compositions (as well as related methods, systems, apparatuses and kits) can be used to obtain sequence information from a nucleic acid molecule. Many methods of obtaining sequence information from a nucleic acid molecule are known in the art, and it will be readily appreciated that all such methods are within the scope of the present disclosure. Suitable methods of sequencing using the disclosed modified polymerases include without limitation: Sanger sequencing, ligation-based sequencing (also known as sequencing by hybridization) and sequencing by synthesis. Sequencing-by-synthesis methods typically involve template-dependent nucleic acid synthesis (e.g., using a primer that is hybridized to a template nucleic acid or a self-priming template, as will be appreciated by those of ordinary skill), based on the sequence of a template nucleic acid. That is, the sequence of the newly synthesized nucleic acid strand is typically complementary to the sequence of the template nucleic acid and therefore knowledge of the order and identity of nucleotide incorporation into the synthesized strand can provide information about the sequence of the template nucleic acid strand. Sequencing by synthesis using the modified polymerases of the disclosure will typically involve detecting the order and/or identity of nucleotide incorporation when nucleotides are polymerized in a template-dependent fashion by the modified polymerase. In some embodiments, sequencing-by-synthesis can include optical single molecule sequencing (e.g., sequencing in the absence of labeled nucleotides). Alternatively, some exemplary methods of sequencing-by-synthesis using labeled nucleotides include single molecule sequencing (see, e.g., U.S. Pat. Nos. 7,329,492 and 7,033,764), which typically involves the use of labeled nucleotides to detect nucleotide incorporation. In some embodiments, the disclosed polymerase compositions (and related methods, kits, systems and apparatuses) can be used to obtain sequence information. In some embodiments, the disclosed modified polymerase can be used to obtain sequence information for whole genome sequencing, amplicon sequencing, targeted re-sequencing, single molecule sequencing, multiplex and/or barcoded sequencing, or paired end sequencing applications, and the like.

In some embodiments, the disclosed modified polymerase compositions, as well as related methods, systems, apparatuses and kits, can be used to amplify nucleic acid molecules. In some embodiments, a nucleic acid molecule can be amplified using a modified polymerase by any appropriate method. In some embodiments, a nucleic acid molecule can be amplified for example by pyrosequencing, ion-based ISFET sequencing, PCR, emulsion PCR or bridge polymerase chain reaction.

In some embodiments, the disclosed modified polymerase compositions (as well as related methods, systems, apparatuses and kits) can be used to generate nucleic acid libraries. In some embodiments, the disclosed modified polymerase compositions can be used to generate nucleic acid libraries for a variety of downstream processes. Many methods for generating nucleic acid libraries are known in the art, and it will be readily appreciated that all such methods are within the scope of the present disclosure. Suitable methods include, without limitation, nucleic acid libraries generated using emulsion PCR, bridge PCR, PCR, qPCR, RT-PCR, nested patch PCR, and other forms of nucleic acid amplification dependent on polymerization. In some embodiments, the methods can include template-dependent nucleic acid amplification. In some embodiments, the methods can include a primer:template duplex or a nucleic acid template to which the modified polymerase can perform nucleotide incorporation. In some embodiments, the nucleic acid can include a single stranded nucleic acid with a secondary structure such as a hair-pin or stem-loop that can provide a single-stranded overhang to which the modified polymerase can incorporate a nucleotide during polymerization. In some embodiments, methods for generating nucleic acid libraries using one or more of the modified polymerases according to the disclosure can include the generation of a nucleic acid library of 50, 100, 200, 300, 400, 500, 600, 700, 800, or more base pairs in length. In some embodiments, the nucleic acid template to which the modified polymerase can perform nucleotide incorporation can be attached, linked or bound to a support, such as a solid support. In some embodiments, the support can include a planar support such as a slide or flowcell. In some embodiments, the support can include a particle, such as a nucleic acid sequencing bead (e.g., an Ion Sphere™ particle (Life Technologies Corp., CA)).

In some embodiments, the disclosure relates generally to a method for generating a nucleic acid library comprising contacting a nucleic acid template with a modified polymerase and one or more dNTPs under polymerizing conditions; thereby incorporating one or more dNTPs into the nucleic acid template to generate the nucleic acid library. In some embodiments, the method can further include generating a nucleic acid library or sequencing a nucleic acid library in the presence of a high ionic strength solution. In some embodiments, the disclosure relates generally to a modified polymerase that retains polymerase activity in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution can be at least 120 mM salt. In some embodiments, the high ionic strength solution can be from 125 mM to 200 mM salt. In some embodiments, the salt can include a potassium and/or sodium salt. In some embodiments, the salt can include NaCl and/or KCl. In some embodiments, the high ionic strength solution can further include a sulfate. In some embodiments, a modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater capacity (for example as measured by accuracy) than a reference polymerase lacking one or more of the corresponding amino acid mutations under identical conditions. In some embodiments, a modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater capacity (for example as measured by thermostability) than a reference polymerase lacking one or more of the amino acid mutations under identical conditions. In some embodiments, a modified polymerase is capable of amplifying (and/or sequencing) a nucleic acid molecule in the presence of a high ionic strength solution to a greater capacity (for example as measured by processivity) than a reference polymerase lacking one or more of the amino acid mutations under identical conditions.
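
By way of non-limiting illustration, the salt ranges recited above can be checked with a short calculation. The following Python sketch is an assumption-laden example and not part of the disclosed methods: the function names, the salt table and the example concentrations are hypothetical, complete dissociation of each salt is assumed, and the standard relation I = 1/2 Σ c_i z_i² is used to show how a sulfate contributes more to ionic strength than a 1:1 salt such as KCl or NaCl at the same concentration.

```python
# Illustrative sketch only; names and example values are hypothetical.
# Each salt maps to a list of (ions per formula unit, ion charge),
# assuming complete dissociation in solution.
SALTS = {
    "KCl": [(1, +1), (1, -1)],
    "NaCl": [(1, +1), (1, -1)],
    "K2SO4": [(2, +1), (1, -2)],  # example sulfate salt (an assumption)
}


def ionic_strength_mM(salts_mM):
    """Return ionic strength in mM using I = 1/2 * sum(c_i * z_i**2)."""
    total = 0.0
    for name, conc in salts_mM.items():
        for count, charge in SALTS[name]:
            total += conc * count * charge ** 2
    return 0.5 * total


def total_salt_mM(salts_mM):
    """Return the summed salt concentration in mM."""
    return float(sum(salts_mM.values()))


def meets_salt_threshold(salts_mM, minimum_mM=120.0):
    """Check the 'at least 120 mM salt' condition on total salt concentration."""
    return total_salt_mM(salts_mM) >= minimum_mM


def in_salt_range(salts_mM, low_mM=125.0, high_mM=200.0):
    """Check the '125 mM to 200 mM salt' condition on total salt concentration."""
    return low_mM <= total_salt_mM(salts_mM) <= high_mM


if __name__ == "__main__":
    mix = {"KCl": 100.0, "NaCl": 25.0}
    print(total_salt_mM(mix), ionic_strength_mM(mix), in_salt_range(mix))
```

For 1:1 salts the ionic strength equals the salt concentration, so the two quantities differ only when a multivalent ion such as sulfate is present.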

Optionally, the method further includes repeating the addition of one or more dNTPs under polymerizing conditions to incorporate a plurality of dNTPs into the nucleic acid template to generate the nucleic acid library.

In some embodiments, the method can further include detecting a nucleotide incorporation by-product during the polymerization. In some embodiments, the nucleotide incorporation by-product can include a hydrogen and/or phosphate ion.

In some embodiments, the method further includes determining the identity of the incorporated dNTPs in the nucleic acid library. In some embodiments, the method further includes determining the number of incorporated nucleotides in the nucleic acid library. In some embodiments, the detecting can further include sequencing the nucleic acid library.

In some embodiments, the disclosed modified polymerase compositions (as well as related methods, systems, apparatuses and kits) can be used to detect nucleotide incorporation through the by-products generated during the nucleotide incorporation event. Many methods of detecting nucleotide incorporation by-products are known in the art, and it will be readily appreciated that all such methods are within the scope of the present disclosure. Suitable methods of nucleotide by-product detection include, without limitation, detection of hydrogen ion, inorganic phosphate, inorganic pyrophosphate, and the like. Several of these by-product detection methods typically involve template-dependent nucleotide incorporation.

In some embodiments, the modified polymerase of the present disclosure can be used to perform label-free nucleic acid sequencing, and in particular, ion-based nucleic acid sequencing. The concept of label-free nucleic acid sequencing, including ion-based nucleic acid sequencing, has been described in the following references: Rothberg et al., U.S. Patent Publication Nos. 2009/0026082, 2009/0127589, 2010/0301398, 2010/0300895, 2010/0300559, 2010/0197507, and 2010/0137143, which are incorporated by reference herein in their entireties. Briefly, in such nucleic acid sequencing applications, nucleotide incorporations are determined by detecting the presence of natural byproducts of polymerase-catalyzed nucleic acid synthesis reactions, including hydrogen ions, polyphosphates, PPi, and Pi (e.g., in the presence of pyrophosphatase).

In a typical embodiment of ion-based nucleic acid sequencing, nucleotide incorporations are detected by detecting the presence and/or concentration of hydrogen ions generated by polymerase-catalyzed nucleic acid synthesis reactions, including for example primer extension reactions. In one embodiment, templates that are operably bound to a primer and a polymerase and that are situated within reaction chambers (such as the microwells disclosed in Rothberg et al., cited above) are subjected to repeated cycles of polymerase-catalyzed nucleotide addition to the primer (“adding step”) followed by washing (“washing step”). In some embodiments, such templates may be attached as clonal populations to a solid support, such as a microparticle, nucleic acid sequencing bead, or the like, and said clonal populations are loaded into reaction chambers. As used herein, “operably bound” means that a primer is annealed to a template so that the primer can be extended by a polymerase and that a polymerase is bound to such primer-template duplex, or in close proximity thereof so that primer extension takes place whenever sufficient nucleotides are supplied.

In each adding step of the cycle, the polymerase extends the primer by incorporating the added nucleotide in a template-dependent fashion, such that the nucleotide is incorporated only if the next base in the template is the complement of the added nucleotide. If there is one complementary base, there is one incorporation; if two, there are two incorporations; if three, there are three incorporations; and so on. With each such incorporation there is a hydrogen ion released, and collectively a population of templates releasing hydrogen ions changes the local pH of the reaction chamber. In some embodiments, the production of hydrogen ions is proportional to (e.g., monotonically related to) the number of contiguous complementary bases in the template (as well as the total number of template molecules with primer and polymerase that participate in an extension reaction). Thus, when there are a number of contiguous identical complementary bases in the template (i.e. a homopolymer region), the number of hydrogen ions generated, and therefore the magnitude of the local pH change, is proportional to the number of contiguous identical complementary bases. If the next base in the template is not complementary to the added nucleotide, then no incorporation occurs and no hydrogen ion is released.

In some embodiments, after each step of adding a nucleotide, a washing step is performed, in which an unbuffered wash solution at a predetermined pH is used to remove the nucleotide of the previous step in order to prevent misincorporations in later cycles (incomplete extension). In some embodiments, after each step of adding a nucleotide, an additional step may be performed wherein the reaction chambers are treated with a nucleotide-destroying agent, such as apyrase, to eliminate any residual nucleotides remaining in the chamber, thereby minimizing the probability of spurious extensions in subsequent cycles. In some embodiments, the treatment may be included as part of the washing step itself.

In one exemplary embodiment, different kinds (or “types”) of nucleotides are added sequentially to the reaction chambers, so that each reaction is exposed to the different nucleotide types one at a time. For example, nucleotide types can be added in the following sequence: dATP, dCTP, dGTP, dTTP, dATP, dCTP, dGTP, dTTP, and so on; with each exposure followed by a wash step. The cycles may be repeated for 50 times, 100 times, 200 times, 300 times, 400 times, 500 times, 750 times, or more, depending on the length of sequence information desired. In some embodiments, the time taken to apply each nucleotide sequentially to the reaction chamber (i.e. flow cycle) can be varied depending on the sequencing information desired. For example, the flow cycle can in some instances be reduced when sequencing long nucleic acid molecules to reduce the overall time needed to sequence the entire nucleic acid molecule. In some embodiments, the flow cycle can be increased, for example when sequencing short nucleic acids or amplicons. In some embodiments, the flow cycle can be about 0.5 seconds to about 3 seconds. In some embodiments, the flow cycle can be about 1 second to about 1.5 seconds.
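
By way of non-limiting illustration, the sequential nucleotide flows and the homopolymer-dependent signal described above can be modeled as follows. This Python sketch is an idealized example only: the function names are hypothetical, each incorporation is assumed to contribute exactly one unit of signal, and effects such as noise, incomplete extension and signal decay are ignored.

```python
# Illustrative, idealized model of flow-based sequencing; names are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# One cycle flows dATP, dCTP, dGTP, dTTP in turn, each followed by a wash.
FLOW_ORDER = "ACGT"


def simulate_flows(template, n_cycles):
    """Return a list of (nucleotide, incorporations) for each flow.

    `template` lists the template bases in the order the polymerase
    encounters them. A flowed nucleotide is incorporated only while the
    next template base is its complement, so a homopolymer run of length
    n yields n incorporations (and n hydrogen ions) in a single flow.
    """
    position = 0
    signals = []
    for _ in range(n_cycles):
        for nucleotide in FLOW_ORDER:
            count = 0
            while (position < len(template)
                   and COMPLEMENT[template[position]] == nucleotide):
                count += 1
                position += 1
            signals.append((nucleotide, count))
    return signals


def call_bases(signals):
    """Reconstruct the synthesized strand from per-flow incorporation counts."""
    return "".join(nucleotide * count for nucleotide, count in signals)


if __name__ == "__main__":
    flows = simulate_flows("TTGCA", n_cycles=1)
    print(flows)
    print(call_bases(flows))
```

In this model a flow with no complementary template base yields a count of zero, which corresponds to no hydrogen ion release and no change in local pH.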

In one embodiment, the disclosure relates generally to a method of detecting a nucleotide incorporation, including: performing a nucleotide incorporation using a modified polymerase and generating one or more byproducts of the nucleotide incorporation; and detecting the presence of at least one of the one or more byproducts of the nucleotide incorporation, thereby detecting the nucleotide incorporation.

In some embodiments, the method can further include repeating the performing and detecting steps at least once. In some embodiments, the modified polymerase exhibits increased read length and/or processivity relative to a reference polymerase under otherwise similar or identical reaction conditions.

In some embodiments, detecting the presence of the sequencing byproduct includes contacting the reaction mixture with a sensor capable of sensing the presence of the sequencing byproduct. The sensor can include a field effect transistor, for example a chemFET or an ISFET. In some embodiments, sequencing byproducts of nucleotide incorporation can include a hydrogen ion, a dye-linked moiety, a polyphosphate, a pyrophosphate or a phosphate moiety, and detecting the presence of the sequencing byproduct includes using an ISFET to detect the sequencing byproduct. In some embodiments, the detecting step includes detecting a hydrogen ion using the ISFET.

In some embodiments, the modified polymerase includes a polymerase linked to a bridging moiety. The bridging moiety is optionally linked to the polymerase through one or more attachment sites within the modified polymerase. In some embodiments, the bridging moiety is linked to the polymerase through a linking moiety. The linking moiety can be linked to at least one of the one or more attachment sites of the polymerase. In some embodiments, the polymerase of the modified polymerase includes a single attachment site, and the bridging moiety is linked to the polymerase through the single attachment site, either directly or through a linking moiety. In some embodiments, the single attachment site can be linked to a biotin moiety, and the bridging moiety can include an avidin moiety. In some embodiments, the bridging moiety is linked to the polymerase through at least one biotin-avidin bond. In some embodiments, the modified polymerase exhibits increased read length and/or processivity and/or read accuracy, increased total throughput, reduced strand bias, and/or lowered systematic error relative to a reference polymerase under otherwise similar or identical reaction conditions.

In some embodiments, the disclosure relates generally to a method of detecting a change in ion concentration during a nucleotide polymerization reaction, including: performing a nucleotide polymerization reaction using a modified polymerase including a polymerase linked to a bridging moiety, wherein the concentration of at least one type of ion changes during the course of the nucleotide polymerization reaction; and detecting a signal indicating the change in concentration of the at least one type of ion.

In some embodiments, the method can further include repeating the performing and detecting steps at least once. In some embodiments, detecting the change in concentration of the at least one type of ion includes using a sensor capable of sensing the presence of the byproduct. The sensor can include a field effect transistor, for example a chemFET or an ISFET. In some embodiments, the at least one type of ion includes a hydrogen ion, a polyphosphate, a pyrophosphate or a phosphate moiety, and detecting the change in concentration of the at least one type of ion includes using an ISFET to detect the at least one type of ion. In some embodiments, the at least one type of ion includes a hydrogen ion, and detecting the presence of the at least one type of ion includes detecting the hydrogen ion using an ISFET.

In some embodiments, the disclosure relates generally to methods (and related kits, systems, apparatuses and compositions) for performing a nucleotide polymerization reaction comprising or consisting of contacting a modified polymerase or a biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase, and where the modified polymerase or the biologically active fragment thereof has improved accuracy, coverage and/or processivity as compared to the reference polymerase, and polymerizing at least one of the one or more nucleotides using the modified polymerase or the biologically active fragment thereof.

In some embodiments, the disclosure relates generally to methods (and related kits, systems, apparatuses and compositions) for performing a nucleotide polymerization reaction comprising or consisting of contacting a modified polymerase or a biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase, and where the modified polymerase or the biologically active fragment thereof has an increased thermostability relative to the reference polymerase, and polymerizing at least one of the one or more nucleotides using the modified polymerase or the biologically active fragment thereof. In some embodiments, the method includes polymerizing at least one of the one or more nucleotides using the modified polymerase or the biologically active fragment thereof in the presence of a high ionic strength solution. In some embodiments, a high ionic strength solution can include a solution in excess of 100 mM KCl. In some embodiments, a high ionic strength solution includes a solution that is at least 120 mM KCl. In some embodiments, a high ionic strength solution includes a solution that is 125 mM to 200 mM KCl.

In some embodiments, the method can further include polymerizing one of the at least one nucleotides in a template-dependent fashion. In some embodiments, the polymerizing is performed under thermocycling conditions. In some embodiments, the method can further include hybridizing a primer to the nucleic acid template prior to, during, or after the contacting, and where the polymerizing includes polymerizing one of the at least one nucleotides onto an end of the primer using the modified polymerase or the biologically active fragment thereof. In some embodiments, the polymerizing is performed in the proximity of a sensor that is capable of detecting the polymerization of the at least one nucleotide by the modified polymerase or the biologically active fragment thereof. In some embodiments, the method can further include detecting a signal indicating the polymerization of the at least one nucleotide by the modified polymerase or the biologically active fragment thereof using a sensor. In some embodiments, the sensor is an ISFET. In some embodiments, the sensor can include a detectable label or detectable reagent within the polymerizing reaction.

In some embodiments, the disclosure generally relates to methods (and related kits, apparatus, systems and compositions) for performing nucleic acid amplification comprising or consisting of generating an amplification reaction mixture having a modified polymerase, typically having 70, 75, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to SEQ ID NO: 1 or SEQ ID NO: 34, or a biologically active fragment thereof, a primer, a nucleic acid template, and one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase and has improved thermostability relative to the reference polymerase; and subjecting the amplification reaction mixture to amplifying conditions, where at least one of the one or more nucleotides is polymerized onto the end of the primer using the modified polymerase or the biologically active fragment thereof. In some embodiments, the modified polymerase or the biologically active fragment thereof having improved thermostability relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34) comprises or consists of at least 80% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure generally relates to methods (and related kits, apparatus, systems and compositions) for performing nucleic acid amplification comprising or consisting of generating an amplification reaction mixture having a modified polymerase, typically having 70, 75, 80, 85, 90, 95, 96, 97, 98, or 99% sequence identity to SEQ ID NO: 1 or SEQ ID NO: 34, or a biologically active fragment thereof, a primer, a nucleic acid template, and one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase and has improved accuracy relative to the reference polymerase; and subjecting the amplification reaction mixture to amplifying conditions, where at least one of the one or more nucleotides is polymerized onto the end of the primer using the modified polymerase or the biologically active fragment thereof. In some embodiments, the modified polymerase or the biologically active fragment thereof having improved accuracy relative to the reference polymerase (e.g., SEQ ID NO: 1 or SEQ ID NO: 34) comprises or consists of at least 80% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In illustrative embodiments, the method is an emulsion PCR method. As such, it will be understood that the amplification reaction mixture is added to an emulsion oil composition before exposing the nucleic acid template to amplifying conditions. The addition can take place over a period of seconds rather than all at once, and can occur while the emulsion oil composition is being agitated. The solution that includes the emulsion oil and the amplification reaction mixture can then be agitated, for example, for 30 seconds to 30 minutes, 1 minute to 20 minutes, 2 minutes to 10 minutes, or for example, for 5 minutes before being exposed to amplification conditions. The amplification can occur after dispensing the agitated reaction mixture into PCR compatible plates, which can then be loaded onto a thermocycler. In certain embodiments, the emulsion PCR is performed in a reaction mixture that includes 120 to 200 mM salt, such as 120 mM to 150 mM KCl.

In some embodiments, the method further includes determining the identity of the one or more nucleotides polymerized by the modified polymerase. In some embodiments, the method further includes determining the number of nucleotides polymerized by the modified polymerase. In some embodiments, at least 50% of the one or more nucleotides polymerized by the modified polymerase are identified. In some embodiments, substantially all of the one or more nucleotides polymerized by the modified polymerase are identified. In some embodiments, the polymerization occurs in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution comprises 125 mM to 200 mM salt. In some embodiments, the polymerization occurs in the presence of an ionic strength solution of at least 120 mM salt. In some embodiments, the high ionic strength solution comprises KCl and/or NaCl.

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatuses and compositions) for performing a nucleotide polymerization reaction comprising or consisting of mixing a modified polymerase or a biologically active fragment thereof with a nucleic acid template in the presence of one or more nucleotides, where the modified polymerase or the biologically active fragment thereof includes one or more amino acid modifications relative to a reference polymerase (such as SEQ ID NO: 1 or SEQ ID NO: 34); and polymerizing at least one of the one or more nucleotides using the modified polymerase or the biologically active fragment thereof in the mixture. In some embodiments, the modified polymerase or the biologically active fragment thereof has increased accuracy as determined by measuring increased accuracy in the presence of a high ionic strength solution. In some embodiments, the high ionic strength solution refers to a reaction mixture for performing nucleotide polymerization having at least 120 mM KCl. In some embodiments, a high ionic strength solution includes a solution that is 125 mM to 200 mM KCl.

In some embodiments, the methods (and related kits, apparatus, systems and compositions) comprise a modified polymerase or a biologically active fragment thereof comprising or consisting of at least 80% identity to SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.
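
By way of non-limiting illustration, a percent identity threshold such as the 80% figure recited above can be evaluated on a pair of pre-aligned sequences. The following Python sketch is an example under stated assumptions and does not define how identity is determined in this disclosure: the function names are hypothetical, the inputs are assumed to be already aligned with "-" marking gaps, columns containing a single gap are counted as mismatches, and the short strings used are made-up examples rather than any SEQ ID NO.

```python
# Illustrative sketch only; names, gap handling and example strings are assumptions.
def percent_identity(aligned_a, aligned_b):
    """Return percent identity over the aligned columns of two sequences.

    Columns where both sequences have a gap are skipped; columns where
    only one sequence has a gap are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == "-" and residue_b == "-":
            continue
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in alignment")
    return 100.0 * matches / compared


def meets_identity(aligned_a, aligned_b, minimum_percent=80.0):
    """Check whether the aligned pair reaches the given identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent


if __name__ == "__main__":
    print(percent_identity("MKTAY", "MKSAY"))
    print(meets_identity("MKTAY", "MKSAY", minimum_percent=90.0))
```

Other conventions, such as normalizing by the length of the shorter sequence or using a particular alignment program, would give different values for the same pair of sequences.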

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatus and compositions) for detecting nucleotide incorporation comprising or consisting of performing a nucleotide incorporation reaction using a modified polymerase or a biologically active fragment thereof having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33, a nucleic acid template, and one or more nucleotide triphosphates; generating the nucleotide incorporation; and detecting the nucleotide incorporation. Detecting nucleotide incorporation can occur via any appropriate means such as PAGE, fluorescence, dPCR quantitation, nucleotide by-product production (e.g., hydrogen ion or pyrophosphate detection; suitable nucleotide by-product detection systems include, without limitation, next-generation (i.e. massively parallel, high throughput) sequencing platforms such as Rain Dance, Roche 454, and Ion Torrent Systems) or nucleotide extension product detection (e.g., optical detection of extension products or detection of labelled nucleotide extension products). In some embodiments, the methods (and related kits, systems, apparatus and compositions) for detecting nucleotide incorporation include or consist of detecting nucleotide incorporation using a modified polymerase or a biologically active fragment thereof that includes at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method of detecting nucleotide incorporation includes or consists of detecting nucleotide incorporation using a modified polymerase or a biologically active fragment thereof that includes at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method of detecting nucleotide incorporation includes or consists of detecting nucleotide incorporation by a modified polymerase or a biologically active fragment thereof that includes at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the method further comprises determining the identity of one or more nucleotides in the nucleotide incorporation. In some embodiments, the byproduct of the nucleotide incorporation is a hydrogen ion. In some embodiments, the byproduct of the nucleotide incorporation is a pyrophosphate. In some embodiments, the byproduct of the nucleotide incorporation is a labeled nucleotide extension product. In some embodiments, the method of detecting nucleotide incorporation includes generating the nucleotide incorporation under emulsion PCR or bridge PCR conditions.

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatus and compositions) for detecting a change in ion concentration during a nucleotide polymerization reaction comprising or consisting of performing a first nucleotide polymerization reaction on a nucleic acid template or nucleic acid library in the presence of one or more nucleotides to be incorporated during the first nucleotide polymerization reaction, wherein the first nucleotide polymerization reaction includes a modified polymerase or a biologically active fragment thereof having at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33; and performing a second nucleotide polymerization reaction, wherein the second nucleotide polymerization reaction detects at least one type of ion concentration change during the course of the second nucleotide polymerization reaction and provides a signal indicating a change in ion concentration of the at least one type of ion. In some embodiments, the ion is a hydrogen ion. In some embodiments, the ion is a pyrophosphate ion. In some embodiments, the signal indicating a change in ion concentration is a relative increase in the production of hydrogen ions in the polymerization reaction. In some embodiments, detection of at least one type of ion concentration change is monitored using an ISFET. In some embodiments, the modified polymerase or the biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 150 contiguous amino acid residues of a polymerase having at least 90% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the modified polymerase or the biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 200 contiguous amino acid residues of the polymerase having at least 95% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the modified polymerase or the biologically active fragment from the first nucleotide polymerization reaction comprises or consists of at least 250 contiguous amino acid residues of the polymerase having at least 98% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33. In some embodiments, the modified polymerase or the biologically active fragment from the first nucleotide polymerization reaction comprises or consists of a polymerase having at least 99% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33.

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatus and compositions) for amplifying a nucleic acid comprising or consisting of contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 80% identity to SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30, SEQ ID NO: 31, SEQ ID NO: 32 or SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the amplifying is performed using a polymerase chain reaction, emulsion polymerase chain reaction, isothermal amplification reaction, recombinase polymerase amplification reaction, proximity ligation amplification, rolling circle amplification or strand displacement amplification. In some embodiments, the amplifying includes clonally amplifying the nucleic acid in solution. In some embodiments, the amplifying includes clonally amplifying the nucleic acid on a solid support such as a nucleic acid bead, flow cell, nucleic acid array, or wells present on the surface of the solid support. In some embodiments, the amplifying is performed using a polymerase or biologically active fragment comprising a thermostable DNA polymerase. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase having improved thermostability as compared to a reference polymerase, such as SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase having improved accuracy as compared to a reference polymerase, such as SEQ ID NO: 1 or SEQ ID NO: 34.

In some embodiments, the methods (and related kits, systems, apparatus and compositions) for amplifying a nucleic acid comprise contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 90% identity to any one of SEQ ID NO: 2 through SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the polymerase or biologically active fragment comprises a DNA polymerase having an improved average read length as compared to the average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34 under identical amplification conditions.

In some embodiments, the methods for amplifying a nucleic acid comprise contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 95% identity to any one of SEQ ID NO: 2 through SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the method includes a polymerase or biologically active fragment having an improved average read length as compared to the average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34 under identical amplification conditions.

In some embodiments, the methods for amplifying a nucleic acid comprise contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 98% identity to any one of SEQ ID NO: 2 through SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the method includes a polymerase or biologically active fragment having an improved average read length as compared to the average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34 under identical amplification conditions.

In some embodiments, the methods for amplifying a nucleic acid comprise contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 99% identity to any one of SEQ ID NO: 2 through SEQ ID NO: 33 under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the method includes a polymerase or biologically active fragment having an improved average read length as compared to the average read length obtained using a DNA polymerase encoded by SEQ ID NO: 1 or SEQ ID NO: 34 under identical amplification conditions.
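By way of non-limiting illustration only, the identity thresholds recited above (80%, 90%, 95%, 98% and 99%) can be checked with a short calculation. The sketch below assumes that the two amino acid sequences have already been aligned to equal length with "-" marking gaps, and that identity is counted as matching columns divided by aligned columns; the disclosure does not prescribe this particular formula, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both strings are the same length and use '-' for gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    # A column counts as identical only if both residues match and neither is a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the aligned pair reaches at least the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy sequences only; these are not sequences from the Sequence Listing.
    print(percent_identity("MKLVE", "MKLIE"))      # one mismatch in five columns
    print(meets_identity("MKLVE", "MKLIE", 80.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different values, so the convention used should be stated alongside any reported identity.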

In some embodiments, average read length is determined by analyzing the read length of the amplified nucleic acids obtained using one or more of the modified polymerases provided herein, across all reads, to establish an average read length, and comparing that average read length to the average read length obtained using the reference polymerase.
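As a non-limiting sketch of the comparison just described, the average read length for each polymerase can be taken as the arithmetic mean of all read lengths in a run, and the two means compared. The function names and the toy read lengths below are hypothetical and are not data from the disclosure.

```python
from statistics import mean
from typing import Sequence


def average_read_length(read_lengths: Sequence[int]) -> float:
    """Arithmetic mean of read lengths (in bases) across all reads of a run."""
    if not read_lengths:
        raise ValueError("at least one read is required")
    return mean(read_lengths)


def read_length_gain(modified_reads: Sequence[int],
                     reference_reads: Sequence[int]) -> float:
    """Average read length of the modified polymerase minus that of the reference.

    A positive value indicates a longer average read length for the
    modified polymerase under the conditions compared.
    """
    return average_read_length(modified_reads) - average_read_length(reference_reads)


if __name__ == "__main__":
    # Toy values for illustration only.
    modified = [300, 320]
    reference = [250, 270]
    print(read_length_gain(modified, reference))
```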

In some embodiments, the disclosure generally relates to methods for amplifying a nucleic acid comprising or consisting of contacting a nucleic acid with a polymerase or a biologically active fragment thereof comprising at least 80% identity to any one of SEQ ID NO: 2 through SEQ ID NO: 33, under suitable conditions for amplification of the nucleic acid; and amplifying the nucleic acid. In some embodiments, the amplifying is performed by a polymerase or biologically active fragment having improved templating efficiency as compared to a reference sample, such as SEQ ID NO: 1 or SEQ ID NO: 34. In some embodiments, the method for amplifying a nucleic acid comprises amplifying the nucleic acid under emulsion PCR conditions. In some embodiments, the method for amplifying a nucleic acid comprises amplifying the nucleic acid under bridge PCR conditions. In some embodiments, the bridge PCR conditions include hybridizing one or more of the amplified nucleic acids to a solid support. In some embodiments, the hybridized one or more amplified nucleic acids can be used as a template for further amplification. In some embodiments, the modified polymerase or biologically active fragment thereof comprises a polymerase that is derived from Thermus aquaticus DNA polymerase (Taq). SEQ ID NO: 1 is the full-length, wild-type, nucleic acid sequence of the DNA polymerase, Thermus aquaticus (Taq). In some embodiments, Taq DNA polymerase can be used as a reference polymerase in the methods, kits, apparatus, systems and compositions described herein.

In some embodiments, the disclosure generally relates to methods (and related kits, systems, apparatus and compositions) for synthesizing a nucleic acid comprising or consisting of incorporating at least one nucleotide onto the end of a primer using a modified polymerase or a biologically active fragment thereof having at least 90% identity to any one of SEQ ID NO: 2 through SEQ ID NO: 33. Optionally, the method further comprises detecting incorporation of the at least one nucleotide onto the end of the primer. In some embodiments, the method further includes determining the identity of at least one of the at least one nucleotide incorporated onto the end of the primer. In some embodiments, the method can include determining the identity of all nucleotides incorporated onto the end of the primer. In some embodiments, the method includes synthesizing the nucleic acid in a template-dependent manner. In some embodiments, the method can include synthesizing the nucleic acid in solution, on a solid support, or in an emulsion (such as emPCR).

In some embodiments, provided herein are kits that include at least two vessels, such as tubes, each containing one or more reaction mixtures or reaction mixture components as provided herein, such as a nucleotide triphosphate and/or a buffer that is effective for nucleic acid polymerization, amplification, and/or sequencing reactions. At least one of the vessels includes a modified polymerase of the invention, or a biologically active fragment thereof, and one or more other tubes can include nucleotides and/or a buffer appropriate for one of the methods provided herein, for example. In some embodiments, the kits can be virtual kits wherein a number of separate reagents are listed, marketed and/or sold together, such as on a web page or a smart phone application that lists different reagents that can be purchased together.

In some embodiments, a kit useful for performing a nucleic acid polymerization reaction includes a buffer, at least one type of nucleotide triphosphate, and a modified polymerase of the invention, or a biologically active fragment thereof.

In some embodiments, a kit useful for performing a nucleic acid polymerization reaction includes a buffer, at least one type of nucleotide triphosphate, optionally a salt such as sodium chloride or potassium chloride, and optionally MgCl₂, and a second tube comprising a modified polymerase of the invention, or biologically active fragment thereof. The tube comprising the polymerase can include stabilizers and other components such as glycerol and a detergent such as, for example, Tween-20 or NP-40. In some illustrative embodiments, the kit can include a third vessel that contains a solid support, such as beads, that optionally contain primers for performing a method provided herein. In certain embodiments, the kit can further include a tube that includes oil useful for forming emulsions for emulsion PCR and optionally emulsion stabilizers. For example, and not intended to be limiting, such a tube can include a biocompatible mineral oil, Allox 4912, and Span 80.

In some embodiments, one tube of the kit includes a buffer composition that includes any one or more of the following: a monovalent metal salt, a divalent metal salt, a divalent anion, and a detergent. For example, the buffer composition can include a potassium or sodium salt. For example, the buffer composition can include 50 to 200 mM salt, 50 to 100 mM salt, or 120 to 200 mM salt, such as 120 to 150 mM KCl. In some embodiments, the buffer composition can include a manganese or magnesium salt. In some embodiments, the buffer composition can include a sulfate such as potassium sulfate and/or magnesium sulfate. In some embodiments, the buffer composition can include a detergent, such as Triton and/or Tween.

In some embodiments, the kit includes a tube with a buffer composition that includes at least one potassium salt, at least one manganese salt, and Triton X-100 (Pierce Biochemicals). The salt can optionally include a chloride salt or a sulfate salt. In some embodiments, the buffer composition can have a pH of about 7.3 to about 8.0. In some embodiments, the buffer composition can have a pH of about 7.4 to about 7.9. In some embodiments, the buffer composition includes a potassium salt at a concentration of 5-250 mM, 50-225 mM, or 125-200 mM (depending on the divalent).

In some embodiments, the buffer composition included in a tube of a kit includes magnesium or manganese salt at a concentration of between 1 mM and 20 mM. In some embodiments, the buffer composition includes magnesium or manganese salt at a concentration of between 6 mM and 15 mM. In some embodiments, the buffer composition includes a sulfate at a concentration of between 1 mM and 100 mM. In some embodiments, the buffer composition includes a sulfate at a concentration of between 5 mM and 50 mM. In some embodiments, the buffer composition of the kit includes a detergent (e.g., Triton X-100 or Tween-20) at a concentration of between 0.001% and 1%. In some embodiments, the buffer composition includes a detergent (e.g., Triton X-100 or Tween-20) at a concentration of between 0.0025% and 0.0125%.

In some embodiments, the kit further includes a nucleic acid capture bead. These kits can include a tube with an emulsion oil such as mineral oil.

The following non-limiting examples are provided purely by way of illustration of exemplary embodiments, and in no way limit the scope and spirit of the present disclosure. Furthermore, it is to be understood that any inventions disclosed or claimed herein encompass all variations, combinations, and permutations of any one or more features described herein. Any one or more features may be explicitly excluded from the claims even if the specific exclusion is not set forth explicitly herein. It should also be understood that disclosure of a reagent for use in a method is intended to be synonymous with (and provide support for) that method involving the use of that reagent, according either to the specific methods disclosed herein, or other methods known in the art unless one of ordinary skill in the art would understand otherwise. In addition, where the specification and/or claims disclose a method, any one or more of the reagents disclosed herein may be used in the method, unless one of ordinary skill in the art would understand otherwise.

EXAMPLES

Example 1: Production & Purification of Exemplary Modified Polymerases

Amino acid mutations were introduced via site-directed mutagenesis into an exemplary reference polymerase having the amino acid sequence of SEQ ID NO: 1. In this example, wild-type full length Taq DNA Polymerase (832 amino acids in length) was used as the reference polymerase from which exemplary mutations were introduced via site-directed saturation mutagenesis.

Here, each of the 831 amino acid residues of SEQ ID NO: 1 following the methionine at amino acid position 1 was substituted with every possible amino acid. Recombinant expression constructs encoding these modified polymerases were transformed into bacteria. Colonies containing expression constructs were inoculated into BRM media, grown to OD=0.600 and induced by adding IPTG to a final concentration of 1 mM. The cells were then grown for a further 3 hours at 37° C.

The induced cells were centrifuged for 10 minutes at 6000 rpm, supernatant was discarded, and the cells were resuspended in resuspension buffer (10 mM Tris, pH 7.5, 100 mM NaCl). The resuspended cells were sonicated at a setting of 60 (amplitude) for one minute, and then placed on ice for 1 minute. The sonication was repeated in this manner for a total of 5 times. The samples were incubated at 65° C. for 10 minutes. The samples were centrifuged at 9000 rpm for 30 minutes. The supernatant was recovered and further purified over a Heparin column.

Purified polymerases were assessed for expression and/or polymerase activity as compared to SEQ ID NO: 1. The number of amino acid residues substituted along the entire length of the WT Taq DNA polymerase was 831. The average number of amino acid variants observed at each amino acid residue along the polymerase was 17.8 variants per amino acid residue. The total number of polymerase clones (each consisting of a single amino acid substitution as compared to SEQ ID NO: 1) achieved using this method was 14,833. The total number of clones possessing characteristics that were substantially superior to characteristics observed for polymerase performance using SEQ ID NO: 1 under standard emulsion PCR conditions was 332 clones. These superior characteristics or properties included thermostability and/or polymerase activity in 125 mM KCl or NaCl and/or at least one of the following characteristics or properties associated with a sequencing reaction when the mutant polymerases were used in a template emulsion PCR amplification step of the nucleic acids analyzed in the sequencing reaction: read length, accuracy, strand bias, systematic error, and total sequencing throughput. The number of clones that were assessed for secondary characterization was 31 clones, where each clone consisted of a single amino acid substitution as compared to SEQ ID NO: 1. As provided herein, exemplary modified polymerases SEQ ID NO: 5 through SEQ ID NO: 33 are modified polymerases consisting of a single amino acid substitution as compared to SEQ ID NO: 1.
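The library figures reported above can be cross-checked with simple arithmetic. The sketch below is illustrative only and assumes that "every possible amino acid" means the 19 alternative standard amino acids at each of the 831 substituted positions; that assumption, and all variable names, are not taken from the disclosure.

```python
# Figures stated in Example 1.
SUBSTITUTED_POSITIONS = 831      # residues after the initial methionine
OBSERVED_CLONES = 14_833         # single-substitution clones obtained
SUPERIOR_CLONES = 332            # clones superior to SEQ ID NO: 1 in emPCR

# Assumption (not stated in the source): 19 alternative standard amino acids
# are possible at each position.
ALTERNATIVES_PER_POSITION = 19

theoretical_library = SUBSTITUTED_POSITIONS * ALTERNATIVES_PER_POSITION
variants_per_position = OBSERVED_CLONES / SUBSTITUTED_POSITIONS
library_coverage = OBSERVED_CLONES / theoretical_library
superior_fraction = SUPERIOR_CLONES / OBSERVED_CLONES

if __name__ == "__main__":
    print(f"theoretical single-substitution variants: {theoretical_library}")
    print(f"average variants per position: {variants_per_position:.1f}")
    print(f"fraction of theoretical library obtained: {library_coverage:.1%}")
    print(f"fraction of clones scored as superior: {superior_fraction:.1%}")
```

Under that assumption, the average of 17.8 variants per residue is consistent with 14,833 clones spread over 831 positions.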

Example 2: Production & Purification of Exemplary Double, Triple, and Quadruple Modified Polymerases

Double, triple, and quadruple amino acid substitutions were introduced via site-directed mutagenesis into an exemplary reference polymerase having the amino acid sequence of SEQ ID NO: 1. In this example, wild-type full length Taq DNA Polymerase (SEQ ID NO: 1) was used as the reference polymerase from which exemplary double, triple, and quadruple amino acid mutations were introduced.

Here, modified polymerases were prepared according to Example 1. Several of the 31 clones assessed for secondary characterization were used as the basis to combine multiple single amino acid substitutions into the reference polymerase according to the method set forth in Example 1.

Briefly, clones of interest from Example 1 were assessed as possessing superior polymerase performance as compared to WT Taq DNA polymerase under identical emPCR conditions. The selected individual amino acid substitutions were then introduced into the reference polymerase (SEQ ID NO: 1) via site-directed mutagenesis to create a plurality of different double, triple, and quadruple amino acid substitution polymerases. Recombinant expression constructs encoding these modified polymerases were transformed into bacteria. Colonies containing expression constructs were inoculated into BRM media, grown to OD=0.600 and induced by adding IPTG to a final concentration of 1 mM. The cells were then grown for a further 3 hours at 37° C.

The induced cells were centrifuged for 10 minutes at 6000 rpm, supernatant was discarded, and the cells were resuspended in resuspension buffer (10 mM Tris, pH 7.5, 100 mM NaCl). The resuspended cells were sonicated at a setting of 60 (amplitude) for one minute, and then placed on ice for 1 minute. The sonication was repeated in this manner for a total of 5 times. The samples were incubated at 65° C. for 10 minutes. The samples were centrifuged at 9000 rpm for 30 minutes. The supernatant was recovered and further purified over a Heparin column.

Purified double, triple, and quadruple polymerases were assessed for expression and/or polymerase activity as compared to SEQ ID NO: 1. As provided herein, SEQ ID NO: 3 and SEQ ID NO: 4 represent exemplary double and triple amino acid substitution polymerases, respectively, possessing superior PCR performance under emulsion PCR conditions as compared to WT Taq DNA Polymerase (SEQ ID NO: 1). These superior characteristics or properties included thermostability and/or polymerase activity in 125 mM KCl or NaCl and/or at least one of the following characteristics or properties associated with a sequencing reaction when the mutant polymerases were used in a template emulsion PCR amplification step of the nucleic acids analyzed in the sequencing reaction: read length, accuracy, strand bias, systematic error, and total sequencing throughput. It is believed that these improvements, at least in part, were the result of increased processivity.

SEQ ID NO: 3 consists of a double amino acid substitution polymerase (L763F+E805I) wherein the numbering is relative to WT Taq DNA polymerase (SEQ ID NO: 1).

SEQ ID NO: 4 consists of a triple amino acid substitution polymerase (E397V+E745T+L763F) wherein the numbering is relative to WT Taq DNA polymerase (SEQ ID NO: 1).
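The substitution shorthand used above (for example, L763F+E805I) names the wild-type residue, its 1-based position in the reference sequence, and the replacement residue. As a non-limiting sketch, such labels can be parsed and applied to a reference amino acid sequence as follows; the helper names are hypothetical, and the short toy sequence stands in for SEQ ID NO: 1, which is not reproduced here.

```python
import re
from typing import List, Tuple

# One substitution token: wild-type residue, 1-based position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(label: str) -> List[Tuple[str, int, str]]:
    """Split a label such as 'L763F+E805I' into (wild_type, position, new) tuples."""
    parsed = []
    for token in label.split("+"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


def apply_substitutions(reference: str, label: str) -> str:
    """Return the reference sequence with each substitution applied.

    Positions are 1-based relative to the reference, and the stated
    wild-type residue is checked before it is replaced.
    """
    residues = list(reference)
    for wild_type, position, new in parse_substitutions(label):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the reference")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    print(parse_substitutions("E397V+E745T+L763F"))
    # Toy four-residue reference, for illustration only.
    print(apply_substitutions("MELK", "E2V+L3F"))
```

Checking the wild-type residue before replacing it guards against numbering that is offset from the reference sequence.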

Example 3: Comparing Performance of Modified and Reference Polymerases in Emulsion PCR

A modified isolated polymerase comprising a mutant Taq DNA polymerase (SEQ ID NO: 2) including an amino acid substitution E397V (wherein the numbering is relative to the amino acid sequence of SEQ ID NO: 1) was purified essentially as described in Example 1. Both the modified polymerase (SEQ ID NO: 2) and a reference polymerase (SEQ ID NO: 1) (control reaction) were then evaluated for performance in emulsion based PCR reactions under identical conditions.

The library of nucleic acid molecules to be amplified under the emulsion PCR conditions included amplicons that were known to be particularly difficult to amplify under standard emPCR conditions. The library of nucleic acid molecules included 55 amplicons that included high or extreme GC content (>60% GC); 42 amplicons having high or extreme AT content (>60% AT); 299 amplicons that included homopolymer (HP) regions of differing lengths (e.g., 2 HP-9 HP); 95 amplicons which prematurely attenuated under standard emPCR conditions; 20 amplicons of 320 bp insert length; and 20 amplicons of 420 bp insert length.
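For illustration only, the sequence-based categories above (GC content above 60%, AT content above 60%, and homopolymer regions) can be assigned with a few lines of code. The sketch assumes amplicon sequences containing only A, C, G and T; the function names, category labels and toy sequences are hypothetical and do not describe how the library in this example was actually selected.

```python
from itertools import groupby


def gc_fraction(sequence: str) -> float:
    """Fraction of bases that are G or C (assumes only A, C, G, T)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def composition_class(sequence: str, threshold: float = 0.60) -> str:
    """Label an amplicon as 'high GC', 'high AT' or 'balanced'.

    The 0.60 default mirrors the >60% GC and >60% AT cut-offs in the text.
    """
    gc = gc_fraction(sequence)
    if gc > threshold:
        return "high GC"
    if 1.0 - gc > threshold:
        return "high AT"
    return "balanced"


def longest_homopolymer(sequence: str) -> int:
    """Length of the longest run of a single repeated base."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return max(len(list(run)) for _, run in groupby(sequence.upper()))


if __name__ == "__main__":
    # Toy sequences for illustration only.
    for amplicon in ("GGCCGGCCAT", "ATATATATGC", "ACGGGGTA"):
        print(amplicon, composition_class(amplicon), longest_homopolymer(amplicon))
```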

Briefly, the library of nucleic acid molecules was adapter-ligated and size selected as described in User Guide for the Ion Fragment Library Kit (Ion Torrent Systems, Part No. 4466464; Publication Part No. 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, Part No. 602-1075-01) essentially according to the protocols provided in the User Guide for the Ion Xpress™ Template Kit v 2.0 (Ion Torrent Systems, Part No. 4469004A) (incorporated by reference herein in its entirety) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, Part No. 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, Part No. 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, Part No. 4466463), except that the on-test or reference polymerase was used in place of the polymerase provided in the kit. The amplified nucleic acid molecules were then loaded into a PGM™ 314 sequencing chip. The chip was loaded into an Ion Torrent PGM™ Sequencing system (Ion Torrent Systems/Life Technologies, Part No. 4462917) and sequenced essentially according to the protocols provided in User Guide for the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4469714 Rev A) and using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, Part No. 4462923). Ion Torrent Systems is a subsidiary of Life Technologies Corp., Carlsbad, Calif.

The resulting sequencing data obtained using the reference or modified polymerase were analyzed to measure AQ20 mean read length (MRL), strand bias, base coverage, accuracy, sequencing throughput (Mb) and uniformity of coverage.

Using the standard software supplied with the PGM™ sequencing system, the sequencing reaction data using the reference or modified polymerases for emPCR were measured and compared. The exemplary modified polymerase including the amino acid substitution E397V (SEQ ID NO: 2) provided significantly increased AQ20 MRL, reduced strand bias, increased base coverage, increased accuracy, increased sequencing throughput (Mb) and increased uniformity of coverage relative to the reference polymerase (SEQ ID NO: 1) (data not shown).

The amino acid sequence corresponding to the modified polymerase provided in this example is SEQ ID NO: 2. It will be readily apparent to one of ordinary skill in the art that any one or more of the modified polymerases disclosed, or suggested, by the disclosure can be readily converted (e.g., reverse translated) to the corresponding nucleic acid sequence encoding the modified polymerase. It will also be apparent to the skilled artisan that the nucleic acid sequence for each polypeptide is variable due to the degenerate nature of codons. For example, there are six codons that code for leucine (CTT, CTC, CTA, CTG, TTA and TTG). Thus, the base at position 1 of this codon can be a C or T, the base at position 2 of this codon is always a T, and the base at position 3 can be T, C, A or G. Accordingly, any modified polymerase disclosed or suggested by the disclosure can be reverse translated to any one or more of the degenerate codon nucleic acid sequences.
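The codon degeneracy described above can be illustrated with a short, non-limiting sketch that builds the standard genetic code, lists the codons for a given amino acid, and counts how many distinct DNA coding sequences encode a given peptide. The sketch assumes the standard genetic code written in DNA letters; the names CODON_TABLE, codons_for and count_encodings are hypothetical.

```python
from itertools import product
from math import prod
from typing import List

# Standard genetic code, listed in T, C, A, G order for each codon position
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid: str) -> List[str]:
    """All codons encoding the given one-letter amino acid, sorted alphabetically."""
    codons = sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)
    if not codons:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    return codons


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide (no stop codon)."""
    return prod(len(codons_for(amino_acid)) for amino_acid in peptide)


if __name__ == "__main__":
    print(codons_for("L"))          # the six leucine codons named in the text
    print(count_encodings("MLE"))   # 1 (Met) x 6 (Leu) x 2 (Glu)
```

Because the count is a product over residues, the number of nucleic acid sequences encoding a full-length polymerase is very large, which is why a single amino acid sequence corresponds to many possible coding sequences.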

Example 4: Evaluating Modified Polymerase Performance in High Ionic Strength Emulsion PCR

A variety of modified polymerases consisting of single amino acid substitutions were prepared according to Example 1. The modified DNA polymerases were then evaluated for performance in emulsion PCR to generate nucleic acid libraries essentially according to Example 3. The library of nucleic acid molecules to be amplified under emulsion PCR conditions included amplicons that were known to be particularly difficult to amplify under standard emPCR conditions (see Example 3).

In this example, a salt titration experiment was performed to determine the functionality of the modified polymerases at high ionic strength conditions. The salt titration included evaluation at 75 mM salt, 100 mM salt and high ionic strength solution (125 mM salt). In this example, the high ionic strength condition included 125 mM KCl.

Briefly, the library of nucleic acid molecules was adapter-ligated and size selected as described in User Guide for the Ion Fragment Library Kit (Ion Torrent Systems, Part No. 4466464; Publication Part No. 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, Part No. 602-1075-01) essentially according to the protocols provided in the User Guide for the Ion Xpress™ Template Kit v 2.0 (Ion Torrent Systems, Part No. 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, Part No. 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, Part No. 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, Part No. 4466463), except that the on-test or reference polymerase was used in place of the polymerase provided in the kit.

The amplified nucleic acid molecules were then loaded into a PGM™ 314 sequencing chip. The chip was loaded into an Ion Torrent PGM™ Sequencing system (Ion Torrent Systems/Life Technologies, Part No. 4462917) and sequenced essentially according to the protocols provided in User Guide for the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4469714 Rev A) and using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, Part No. 4462923). Ion Torrent Systems is a subsidiary of Life Technologies Corp., Carlsbad, Calif.

FIGS. 1A-1E depict exemplary results of several modified polymerases consisting of a single amino acid substitution prepared according to Example 1 and evaluated under emPCR conditions at various salt concentrations. The first bar (reading from left to right) at each salt concentration represents the sequencing throughput obtained for each modified polymerase. The second bar (reading from left to right) at each salt concentration represents the mean read length (MRL) obtained for each modified polymerase. The last bar (reading from left to right) at each salt concentration represents the key signal (which is a control to demonstrate whether emPCR occurred). Generally, the level of sequencing throughput for each of the five modified polymerases presented increased in reaction conditions that included 125 mM KCl compared to 75 mM KCl during the emPCR reaction.

Example 5: Evaluating a Double Amino Acid Substitution Mutant Polymerase for Performance in Emulsion PCR

A modified polymerase prepared according to Example 2 comprising a double amino acid substitution of Taq DNA polymerase (E397V+E745T, where the numbering is relative to the amino acid residues of SEQ ID NO: 1) was compared to a Taq DNA polymerase having a single amino acid substitution (SEQ ID NO: 34) for performance in emulsion PCR (emPCR) reactions that generate nucleic acid libraries.

The nucleic acid libraries were produced from Rhodopseudomonas palustris (which has a 5,459,213 base-pair circular chromosome having a GC content of 65.05%; see Larimer et al., Nature Biotechnology, 2004, Volume 22, Number 1, pg 55-61) and evaluated under high ionic strength conditions (125 mM salt; here, 125 mM KCl).

The libraries obtained using the modified polymerases from the emPCR step were applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, Part No. 4462917).

Briefly, the Rhodopseudomonas palustris library was adapter-ligated and size selected (here, the insert was a 420 bp insert) as described in User Guide for the Ion Fragment Library Kit (Ion Torrent Systems, Part No. 4466464; Publication Part No. 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, Part No. 602-1075-01) essentially according to the protocols provided in the User Guide for the Ion Xpress™ Template Kit v 2.0 (Ion Torrent Systems, Part No. 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, Part No. 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, Part No. 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, Part No. 4466463), except that the on-test or reference polymerase was used in place of the polymerase provided in the kit.

The amplified library was then loaded into a PGM™ 314 sequencing chip. The chip was loaded into an Ion Torrent PGM™ Sequencing system (Ion Torrent Systems/Life Technologies, Part No. 4462917) and sequenced essentially according to the protocols provided in User Guide for the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4469714 Rev A) and using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, Part No. 4462923). Ion Torrent Systems is a subsidiary of Life Technologies Corp., Carlsbad, Calif.

The resulting sequencing data obtained using the modified polymerases during high ionic strength emPCR were analyzed to measure the number of AQ20 bases, accuracy, and the average perfect read length of the 420 bp insert. The data for the exemplary sequencing run performed as outlined in this example are shown in FIGS. 2A1-2B2. As can be seen, the sequencing data from 125 mM KCl emPCR conditions obtained from a double amino acid substitution polymerase (E397V+E745T) possessed improved AQ20 reads (386 bp v. 359 bp), improved read accuracy (99.8% v. 99.6%), and improved perfect read length (331 bp v. 290 bp) as compared to a single amino acid substitution polymerase (SEQ ID NO: 34). Both modified DNA polymerases were capable of generating significant sequencing data, as a result of having created substantial numbers of nucleic acid libraries during the emPCR process under high ionic strength conditions. However, the double amino acid substitution polymerase outperformed the single amino acid substitution polymerase (SEQ ID NO: 34) under identical conditions.
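The paired values reported above can be expressed as relative changes. The following sketch is illustrative only; it uses the figures stated in this example (386 bp v. 359 bp, 99.8% v. 99.6%, and 331 bp v. 290 bp), while the function names and the choice to express accuracy as a ratio of error rates are assumptions made for illustration.

```python
def percent_gain(modified: float, comparator: float) -> float:
    """Relative increase of the modified value over the comparator, in percent."""
    if comparator == 0:
        raise ValueError("comparator must be non-zero")
    return 100.0 * (modified - comparator) / comparator


def error_rate(accuracy_percent: float) -> float:
    """Per-base error rate, in percent, implied by a read accuracy in percent."""
    return 100.0 - accuracy_percent


if __name__ == "__main__":
    # Values as reported for E397V+E745T v. SEQ ID NO: 34 at 125 mM KCl.
    print(f"AQ20 read length gain: {percent_gain(386, 359):.1f}%")
    print(f"perfect read length gain: {percent_gain(331, 290):.1f}%")
    ratio = error_rate(99.6) / error_rate(99.8)
    print(f"error-rate ratio (comparator / modified): {ratio:.1f}")
```

On these figures, a 0.2 percentage point difference in accuracy corresponds to roughly half the error rate, which a simple difference in accuracy does not make obvious.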

Example 6: Amino Acid Substitution Taq Polymerase Mutants for Performance in Emulsion PCR

A variety of modified polymerases prepared according to Example 1 consisting of different single amino acid substitutions were compared to a Taq DNA polymerase mutant also consisting of a single amino acid substitution (SEQ ID NO: 34) for performance in emulsion PCR reactions that generate nucleic acid libraries. The nucleic acid libraries were produced from Rhodopseudomonas palustris and were evaluated under various ionic strength conditions (e.g., 75 mM to 150 mM KCl).

The libraries obtained using the modified polymerases from the emPCR were applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, Part No. 4462917).

Briefly, the Rhodopseudomonas palustris library was adapter-ligated and size selected (here, the insert was a 420 bp insert) as described in User Guide for the Ion Fragment Library Kit (Ion Torrent Systems, Part No. 4466464; Publication Part No. 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, Part No. 602-1075-01) essentially according to the protocols provided in the User Guide for the Ion Xpress™ Template Kit v 2.0 (Ion Torrent Systems, Part No. 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, Part No. 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, Part No. 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, Part No. 4466463), except that the on-test or reference polymerase was used in place of the polymerase provided in the kit.

The amplified library was then loaded into a PGM™ 314 sequencing chip. The chip was loaded into an Ion Torrent PGM™ Sequencing system (Ion Torrent Systems/Life Technologies, Part No. 4462917) and sequenced essentially according to the protocols provided in the User Guide for the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4469714 Rev A) and using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, Part No. 4462923). (Ion Torrent Systems is a subsidiary of Life Technologies Corp., Carlsbad, Calif.)

The resulting exemplary sequencing data obtained using the modified polymerases during emPCR was analyzed to measure AQ20 total base count, AQ17 mean, AQ20 mean, uniformity of coverage, strand bias, total base coverage, and systematic sequencing error (SSE), among other criteria. The data for the sequencing runs performed as outlined in this example are shown in FIGS. 3A-3B. As can be seen, the sequencing data obtained from several of the single amino acid substitution polymerases showed improved AQ20 total base count, improved AQ20 reads, reduced strand bias and reduced SSE as compared to the reference single amino acid substitution polymerase (SEQ ID NO: 34). For example, the single amino acid substitution polymerases (E397V, E794C or R593G) each outperformed the single amino acid substitution polymerase of SEQ ID NO: 34 under identical conditions.

Example 7: Evaluating Mutant Polymerases with Respect to Thermostability (GC Coverage)

A modified polymerase comprising a single amino acid substitution (E397V) of Taq DNA polymerase having the amino acid sequence of SEQ ID NO: 2 was compared to a different single amino acid substitution of Taq DNA polymerase (SEQ ID NO: 34) for performance in emulsion PCR reactions that generate nucleic acid libraries. The nucleic acid libraries were produced from Rhodopseudomonas palustris, which has a GC content of over 65%. The ability of each mutant polymerase to function during emPCR in a high ionic strength solution (e.g., 125 mM KCl) was evaluated.

The libraries obtained using the modified Taq DNA polymerases from the emPCR step were applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, Part No. 4462917).

Briefly, the Rhodopseudomonas palustris library was adapter-ligated and size selected (here, the insert was a 420 bp insert) as described in the User Guide for the Ion Fragment Library Kit (Ion Torrent Systems, Part No. 4466464; Publication Part No. 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, Part No. 602-1075-01) essentially according to the protocols provided in the User Guide for the Ion Xpress™ Template Kit v 2.0 (Ion Torrent Systems, Part No. 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, Part No. 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, Part No. 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, Part No. 4466463), except that the on-test or reference polymerase was used in place of the polymerase provided in the kit.

The amplified library was then loaded into a PGM™ 314 sequencing chip. The chip was loaded into an Ion Torrent PGM™ Sequencing system (Ion Torrent Systems/Life Technologies, Part No. 4462917) and sequenced essentially according to the protocols provided in the User Guide for the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4469714 Rev A) and using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, Part No. 4462923). (Ion Torrent Systems is a subsidiary of Life Technologies Corp., Carlsbad, Calif.)

The resulting exemplary sequencing data obtained using the modified Taq DNA polymerases as outlined in this example are shown in FIG. 4. In FIG. 4, “Taglr1” refers to SEQ ID NO: 34 (D732R), while “Hit #1” refers to amino acid substitution “E397V” (SEQ ID NO: 2). Taglr1 or Taq-LR1 polymerase was engineered to have higher template affinity than wild-type Taq polymerase. This increased template affinity has allowed for templating of longer libraries, presumably as a result of improved fidelity and processivity. As is evident from the data, the single amino acid substitution polymerase “E397V” outperforms Taq-LR1 (SEQ ID NO: 34) with respect to read length, sequencing throughput, and uniformity of coverage under identical emPCR conditions.

Additionally, the nucleic acid library amplified during emPCR was a library from a species of bacteria having at least 65% GC content. It is well known in the art that amplification of nucleic acid molecules having high GC content is usually difficult compared to non-GC rich targets (McDowell et al., Nucleic Acids Research, 1998; 26: 3340-3347). It is also well known in the art that GC content is predictive of melting temperature. Thus, a high GC content genome will possess a higher melting temperature than a lower GC content genome. Here, the “E397V” modified polymerase offered substantially greater amplification of a high GC content nucleic acid library as compared to SEQ ID NO: 34. Based on the overall sequencing data in this example, it was determined that the modified polymerase (SEQ ID NO: 2) resulted in less than 5 gaps of genomic coverage per gigabyte of sequencing data, while SEQ ID NO: 34 resulted in at least 99 gaps of genomic coverage per gigabyte of sequencing data.
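By way of illustration only, the relationship between GC content and melting temperature noted above can be sketched in Python. The function names are hypothetical, and the linear formula is the textbook Marmur-Doty approximation for long duplex DNA in standard saline citrate, which is an assumption introduced here and is not part of the example; it does not model emPCR buffer conditions.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases among the A/C/G/T bases of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(1 for b in bases if b in "GC") / len(bases)


def marmur_doty_tm(gc_percent: float) -> float:
    """Approximate melting temperature in degrees C for long duplex DNA.

    Assumption: the textbook Marmur-Doty relation Tm = 69.3 + 0.41 * (%GC),
    derived for standard saline citrate, not for emPCR buffers.
    """
    return 69.3 + 0.41 * gc_percent


# Illustrative comparison of a 65% GC genome with a 50% GC genome.
delta_tm = marmur_doty_tm(65.0) - marmur_doty_tm(50.0)
print(f"Approximate Tm difference: {delta_tm:.2f} degrees C")
```

Under this approximation, each additional percentage point of GC content raises the estimated melting temperature by 0.41° C., which is consistent with the statement that a high GC content genome melts at a higher temperature.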

The ability to accurately and faithfully sequence the content of GC rich organisms and GC rich regions is useful in a variety of fields, including bacterial studies, where several genera of bacteria possess greater than 65% GC content (e.g., some species of Streptomyces and Mycobacterium). A polymerase that generates 99 or more gaps of genomic coverage per gigabyte of sequencing data is less useful for DNA sequencing, detection methods, and the like, than a polymerase that produces a genomic map having fewer than 20, 10, or 5 gaps of genomic coverage per gigabyte of sequencing data. In order to complete the genome using the former polymerase, a user would need to perform additional amplifications of the genome using “finishing reactions,” which comprise designing and purchasing a pair of primers for each gap per gigabyte of sequencing data. Once successfully designed, the 99 or more primer pair reactions per gigabyte of sequencing data must undergo sufficient amplification to span the gaps present across the entire genome. However, using the latter polymerase (e.g., SEQ ID NO: 2), a user could establish most of the genome of Rhodopseudomonas palustris in a single emPCR reaction. Only if necessary would the user need to prepare 5 primer pairs per gigabyte of sequencing data to complete the genome.
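The finishing-reaction burden described above is simple arithmetic: one primer pair per coverage gap. A minimal sketch follows, assuming the gap counts of 99 and 5 per gigabyte stated in this example as illustrative bounds; the function name and the data volume of 1 gigabyte are hypothetical.

```python
import math


def finishing_primer_pairs(gaps_per_gigabyte: float, gigabytes: float) -> int:
    """Estimate the primer pairs needed, assuming one pair per coverage gap."""
    return math.ceil(gaps_per_gigabyte * gigabytes)


# Illustrative values taken from the gap counts recited in this example.
for label, gaps in (("SEQ ID NO: 34", 99), ("SEQ ID NO: 2", 5)):
    pairs = finishing_primer_pairs(gaps, 1.0)
    print(f"{label}: about {pairs} primer pairs per gigabyte")
```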

Not to be limited by theory, a modified polymerase having improved GC content coverage as defined herein can also correspond to a modified polymerase having improved thermostability (see Example 10). During emPCR, the nucleic acid library must be denatured in order for the amplification step to proceed. If the GC content of the nucleic acid library is high, the nucleic acid library is less likely to denature and any primers present are less likely to anneal correctly to the template strand; as a result, it is less likely that the modified polymerase can initiate polymerization from the primers. Modified polymerase “E397V” was found to possess higher thermostability at 96° C. as compared to SEQ ID NO: 1 (see Example 10). Modified polymerase “E397V” also demonstrated higher coverage uniformity and longer read length at 97° C. as compared to the same reaction at 96° C. (see FIG. 4). Accordingly, modified polymerase “E397V” possesses greater thermostability as compared to SEQ ID NO: 1 under identical emPCR conditions.

Example 8: Evaluating Double Amino Acid Substitution Mutants for Polymerase Performance in Emulsion PCR

In this example, four polymerases each having a double amino acid substitution (E397V+E745T; P6N+E295F; P6N+E397V; or E745T+E794C) were prepared according to Example 2 and compared to a Taq DNA polymerase having a single amino acid substitution (SEQ ID NO: 34) for performance in emPCR reactions under high ionic strength conditions.

The nucleic acid libraries were prepared using an E. coli 500 bp insert. The nucleic acid libraries obtained from the emPCR reactions were applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, Part No. 4462917).

Briefly, the template DNA was purified, adapter-ligated and size selected as described in the User Guide for the Ion Fragment Library Kit (Ion Torrent Systems, Part No. 4466464; Publication Part No. 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, Part No. 602-1075-01) essentially according to the protocols provided in the User Guide for the Ion Xpress™ Template Kit v 2.0 (Ion Torrent Systems, Part No. 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, Part No. 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, Part No. 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, Part No. 4466463), except that the on-test or reference polymerase was used in place of the polymerase provided in the kit.

The amplified library was then loaded into a PGM™ 318 sequencing chip. The chip was loaded into an Ion Torrent PGM™ Sequencing system (Ion Torrent Systems/Life Technologies, Part No. 4462917) and sequenced essentially according to the protocols provided in the User Guide for the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4469714 Rev A) and using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, Part No. 4462923). (Ion Torrent Systems is a subsidiary of Life Technologies Corp., Carlsbad, Calif.)

The resulting sequencing data obtained using the modified polymerases under high ionic strength emPCR (125 mM KCl) was analyzed to measure the number of AQ20 bases, raw read accuracy, and true accuracy for homopolymers (HP) of 5 bp, 6 bp, or 7 bp in length. The data for the exemplary sequencing runs performed as outlined in this example are shown in FIGS. 5A-5B. As can be seen, all four polymerases having a double amino acid substitution outperformed the single amino acid substitution polymerase (SEQ ID NO: 34) in all observed metrics.

Example 9: Evaluating Double & Triple Amino Acid Substitution Polymerase Mutants for Performance in Emulsion PCR

In this example, three polymerases, each having one or more amino acid substitutions, were prepared according to Example 1 or Example 2 and compared to a Taq DNA polymerase having a single amino acid substitution (SEQ ID NO: 34) for performance in emPCR reactions under high ionic strength conditions (e.g., 125 mM KCl).

The nucleic acid libraries were prepared using an E. coli 500 bp insert. The nucleic acid libraries obtained from the emPCR reactions were applied downstream in an ion-based sequencing reaction using the Ion Torrent PGM™ sequencing system (Ion Torrent Systems, Part No. 4462917).

Briefly, the template DNA was purified, adapter-ligated and size selected as described in the User Guide for the Ion Fragment Library Kit (Ion Torrent Systems, Part No. 4466464; Publication Part No. 4467320 Rev B). The library of nucleic acid molecules was then amplified onto Ion Sphere™ particles (Ion Torrent Systems, Part No. 602-1075-01) essentially according to the protocols provided in the User Guide for the Ion Xpress™ Template Kit v 2.0 (Ion Torrent Systems, Part No. 4469004A) and using the reagents provided in the Ion Template Preparation Kit (Ion Torrent Systems/Life Technologies, Part No. 4466461), the Ion Template Reagents Kit (Ion Torrent Systems/Life Technologies, Part No. 4466462) and the Ion Template Solutions Kit (Ion Torrent Systems/Life Technologies, Part No. 4466463), except that the on-test or reference polymerase was used in place of the polymerase provided in the kit.

The amplified library was then loaded into a PGM™ 318 sequencing chip. The chip was loaded into an Ion Torrent PGM™ Sequencing system (Ion Torrent Systems/Life Technologies, Part No. 4462917) and sequenced essentially according to the protocols provided in the User Guide for the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4469714 Rev A) and using the reagents provided in the Ion Sequencing Kit v2.0 (Ion Torrent Systems/Life Technologies, Part No. 4466456) and the Ion Chip Kit (Ion Torrent Systems/Life Technologies, Part No. 4462923). (Ion Torrent Systems is a subsidiary of Life Technologies Corp., Carlsbad, Calif.)

The resulting sequencing data obtained using the modified polymerases under high ionic strength emPCR was analyzed to measure raw accuracy, systematic sequencing error (SSE) and total sequencing throughput (AQ20 bases), among other metrics. The data for the exemplary sequencing runs performed as outlined in this example are shown in FIG. 6. As can be seen, the mutant polymerase “E397V” outperformed the single amino acid substitution polymerase (SEQ ID NO: 34) based on all observed metrics. The “E397V” modified polymerase also outperformed both the double and triple amino acid substitution polymerases under identical emPCR conditions.

Example 10: Comparing Thermostability Performance of Modified and Reference Polymerases

In this example, various modified polymerases containing one or more amino acid substitutions were prepared according to Example 1 or Example 2. Modified polymerase “E397V” consisting of a single amino acid substitution at amino acid residue 397 relative to the numbering of amino acid residues of SEQ ID NO: 1 was prepared according to Example 1. Modified polymerase “SEQ ID NO: 34” consisting of a single amino acid substitution as compared to SEQ ID NO: 1 was prepared according to Example 1.

Modified polymerase “E794C+E805I” consisted of a double amino acid substitution relative to the amino acid numbering of SEQ ID NO: 1 and was prepared according to Example 2. Additionally, modified polymerase “E397V+E745T” consisted of a double amino acid substitution relative to the amino acid numbering of SEQ ID NO: 1 and was prepared according to Example 2.

The polymerases described above were each prepared for thermostability testing as a PCR strip as follows for thermocycling at 95° C.: 15 mM Tris pH 7.5, 100 mM KCl, 30% Trehalose, 0.1% NP40 (a detergent) and 50 nM of polymerase (see FIG. 14).

The PCR strips were incubated for various durations of heat treatment (no heat control=0 min; 2 mins; 4 mins; 6 mins or 8 mins). After heat treatment at 95° C. or 96° C. was completed, the reaction mixtures were placed on ice. The reaction mixtures were then transferred to plates for polymerase activity assays.

Here, the polymerase activity assay was prepared as follows: 15 mM Tris pH 7.5, 100 mM KCl, 8 mM MgCl₂, 150 nM Oligo 221 and 5 nM of polymerase reaction mixture from the heat-treatment step (10 μl) were combined. Oligo 221 is a hairpin oligo with a fluorescent dye attached (TTTTTTTGCAGGTGACAGGTTTTTCCTGTCACCXGC (SEQ ID NO: 50), where X is a fluorescein-dT residue). Upon addition of dATP, oligo 221 is extended, resulting in release of the fluorescence (see Nikiforov, Analytical Biochemistry, (2011) 229-236, incorporated herein by reference in its entirety).

In order to initiate the polymerase activity assay, 20 μM of dATP was added to each reaction. Changes in fluorescence for each heat-treated polymerase at each time point (0, 2, 4, 6 and 8 mins) were measured and plotted (see FIGS. 7-10). Here, the fluorescence signal at 525 nm was measured over a period of time using an excitation wavelength of 490 nm.
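One possible way to reduce such fluorescence time courses to a single thermostability figure is sketched below in Python. This is an assumption offered for illustration and is not the analysis described in this example: the rate of fluorescence change for each heat-treatment duration is estimated by a least-squares slope and expressed as a percentage of the rate of the unheated (0 min) control. All function names and data values are hypothetical.

```python
def initial_rate(times, signal):
    """Least-squares slope of fluorescence signal versus time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(signal) / n
    num = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signal))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def residual_activity(traces):
    """Map heat-treatment minutes to percent activity relative to 0 min.

    traces: dict of heat-treatment minutes -> (times, signal) time course.
    """
    rates = {minutes: initial_rate(t, s) for minutes, (t, s) in traces.items()}
    control = rates[0]  # the unheated control defines 100% activity
    return {minutes: 100.0 * rate / control for minutes, rate in rates.items()}


# Hypothetical time courses (seconds, arbitrary fluorescence units).
seconds = [0, 10, 20, 30]
example = {
    0: (seconds, [100, 120, 140, 160]),  # no heat control
    4: (seconds, [100, 110, 120, 130]),  # after 4 min of heat treatment
}
for minutes, percent in sorted(residual_activity(example).items()):
    print(f"{minutes} min heat treatment: {percent:.0f}% residual activity")
```

Because the result is a ratio of slopes, the sketch gives the same percentage whether the assay signal rises or falls as oligo 221 is extended.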

The thermostability of a variety of additional single or double amino acid substitution polymerases prepared according to Example 1 or Example 2 was also assessed as outlined in this example.

FIG. 11 provides exemplary thermostability data obtained for a plurality of single or double amino acid mutant polymerases compared to SEQ ID NO: 34 (TAQ LR1) at 95° C. in the presence of Trehalose. Mutant polymerase “E397V” and mutant polymerase “G418C” demonstrated the greatest thermostability at 95° C. under the test conditions. It should be noted that the amino acid residue numbers in FIGS. 11-14 represent the residue that is mutated in the following manner: P6N, A77E, A97V, L193V, K240I, R266Q, E267T, L287T, P291T, K292C, E295F or E295N, E397V, G418C, L490Q, A502S, S543V, D578E, R593G, L678F or L678T, S699W, E713W, V737A, E745T, L763F, E790G, E794C, E805I and L828A. Two numbers separated by a slash represent a double mutant with the above mutations at the noted residues.

FIG. 12 provides exemplary thermostability data obtained for the same single or double amino acid mutant polymerases as compared against WT Taq DNA polymerase (SEQ ID NO: 1) (TAQ WT) and SEQ ID NO: 34 (TAQ LR1) at 96° C. in the presence of Trehalose. Mutant polymerase “E397V” and mutant polymerase “G418C” demonstrated the greatest thermostability at 96° C. under the test conditions.

FIG. 13 provides exemplary thermostability data obtained for the same single or double amino acid mutant polymerases as compared against WT Taq DNA polymerase (TAQ WT) at 95° C. in the absence of Trehalose (during the heat-treatment step). Here, mutant polymerase “E397V” and mutant polymerase “G418C” demonstrated superior thermostability at 95° C. in the absence of Trehalose as compared to WT Taq (SEQ ID NO: 1) under the test conditions.

It will be apparent to one of ordinary skill in the art that the foregoing thermostability assay is provided as an exemplary thermostability assay and is not meant as limiting or restricting in any manner. As such, other variations to the thermostability assay provided herein, or other forms of thermostability assays or other means by which to assess residual polymerase activity are contemplated within the scope of the present invention.

What is claimed is:
 1. A composition comprising an isolated polypeptide having at least 95% identity to SEQ ID NO: 1 wherein the polypeptide includes one or more amino acid substitutions selected from the group consisting of L763F, L763F+E805I, or L763F+E397V+E745T, wherein the isolated polypeptide or the biologically active fragment thereof, exhibits polymerase activity and an improvement relative to a reference polymerase of SEQ ID NO: 1 and/or SEQ ID NO: 34, in one or more properties selected from thermostability and/or a sequencing property selected from read length, accuracy, strand bias, systematic error, and total sequencing throughput when the polypeptide is used in an emulsion PCR template amplification step of a sequencing workflow.
 2. A composition according to claim 1, wherein the isolated polypeptide or the biologically active fragment thereof, has improved thermostability at 95° C. for 6 minutes as compared to the thermostability of SEQ ID NO: 1 at 95° C. for 6 minutes.
 3. A composition according to claim 2, wherein the isolated polypeptide or the biologically active fragment thereof, includes L763F.
 4. A composition according to claim 2, wherein the isolated polypeptide or the biologically active fragment thereof, includes L763F+E805I.
 5. The composition of claim 3, wherein the isolated polypeptide or the biologically active fragment thereof, includes L763F+E397V+E745T.
 6. A reaction mixture comprising the modified polymerase of claim 1, a primer, a nucleic acid template, and one or more nucleotides.
 7. The reaction mixture of claim 6, wherein the reaction mixture further comprises a solid support.
 8. The composition of claim 1, wherein the isolated polypeptide or biologically active fragment thereof, further includes P6N, E295F, and/or E794C.
 9. The composition of claim 1, wherein the isolated polypeptide has at least 98% identity to SEQ ID NO: 1.
 10. A method for amplifying a nucleic acid, comprising contacting the nucleic acid with a modified polymerase according to claim 1 or a biologically active fragment thereof, under suitable conditions for amplifying the nucleic acid, and amplifying the nucleic acid.
 11. A method according to claim 10, wherein the modified polymerase has improved thermostability at 95° C. for 6 minutes as compared to the thermostability of SEQ ID NO: 1 at 95° C. for 6 minutes.
 12. A method according to claim 11, wherein the modified polymerase or the biologically active fragment thereof, includes L763F.
 13. A method according to claim 11, wherein the modified polymerase or the biologically active fragment thereof, includes L763F+E805I.
 14. The method of claim 13, wherein the modified polymerase or the biologically active fragment thereof, includes L763F+E397V+E745T.
 15. The method of claim 10, wherein the modified polymerase or biologically active fragment thereof further includes P6N, E295F, and/or E794C.
 16. The method of claim 13, wherein the suitable conditions comprise suitable conditions for performing a polymerase chain reaction.
 17. The method of claim 10, wherein the suitable conditions comprise suitable conditions for performing an emulsion polymerase chain reaction.
 18. The method of claim 10, wherein the amplifying is clonally amplifying the nucleic acid in solution or on a solid support.
 19. The method of claim 10, further comprising determining the nucleic acid sequence of at least a portion of the nucleic acid.
 20. The method of claim 10, wherein the modified polymerase has at least 98% sequence identity to SEQ ID NO: 1.